Re: [GENERAL] BDR question on dboid conflicts

2017-10-27 Thread Zhu, Joshua
Thanks, sounds like that's something unique in my environment/setup.

Here are the results of bdr.bdr_get_local_nodeid() for the four nodes in the group:
Node 1: (6480169638493465053,1,16386)
Node 2: (6480169638493465053,1,20225)
Node 3: (6480169638493465053,1,29164)
Node 4: (6480169638493465053,1,20227)

And here is what the pg_replication_slots view looks like on Node 4:
bdr_20227_6480169638493465053_1_29164__ | bdr | logical | 20227 | mydb | t | 9603 |  | 7750 | 0/2D4E780 | 0/2D4E7B8
bdr_20227_6480169638493465053_1_16386__ | bdr | logical | 20227 | mydb | t | 9602 |  | 7750 | 0/2D4E780 | 0/2D4E7B8
bdr_20227_6480169638493465053_1_20225__ | bdr | logical | 20227 | mydb | t | 9601 |  | 7750 | 0/2D4E780 | 0/2D4E7B8
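
The slot names listed above appear to encode the local database oid followed by the remote node's (sysid, timeline, dboid). The Python sketch below splits them on that assumption; the layout is inferred from this listing only, and the names SlotParts and parse_slot_name are made up for illustration.

```python
import re
from typing import NamedTuple


class SlotParts(NamedTuple):
    local_dboid: int
    remote_sysid: str
    remote_timeline: int
    remote_dboid: int


# Assumed layout, inferred from the listing (not taken from the BDR docs):
# bdr_<local dboid>_<remote sysid>_<remote timeline>_<remote dboid>__<suffix>
SLOT_RE = re.compile(r"^bdr_(\d+)_(\d+)_(\d+)_(\d+)__(.*)$")


def parse_slot_name(name: str) -> SlotParts:
    """Split a BDR-style slot name into its numeric components."""
    m = SLOT_RE.match(name)
    if m is None:
        raise ValueError(f"not a BDR-style slot name: {name!r}")
    local_dboid, sysid, timeline, remote_dboid, _suffix = m.groups()
    return SlotParts(int(local_dboid), sysid, int(timeline), int(remote_dboid))


slots = [
    "bdr_20227_6480169638493465053_1_29164__",
    "bdr_20227_6480169638493465053_1_16386__",
    "bdr_20227_6480169638493465053_1_20225__",
]
parsed = [parse_slot_name(s) for s in slots]
# The remote dboids should be those of Nodes 1, 2 and 3.
remote_dboids = sorted(p.remote_dboid for p in parsed)
```

Read this way, Node 4 (dboid 20227) holds one slot per peer, and each slot name differs only in the peer's dboid because all four nodes report the same sysid and timeline.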

-----Original Message-----
From: pgsql-general-ow...@postgresql.org 
[mailto:pgsql-general-ow...@postgresql.org] On Behalf Of Craig Ringer
Sent: Thursday, October 26, 2017 7:24 PM
To: Zhu, Joshua <j...@thalesesec.net>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] BDR question on dboid conflicts

On 27 October 2017 at 01:15, Zhu, Joshua <j...@vormetric.com> wrote:
> The database oid is used in both bdr.bdr_nodes (as node_dboid) and 
> bdr.bdr_connections (as conn_dboid), and also in the construction of 
> replication slot names.

Correct. However, it's used in conjunction with the sysid and node timeline ID.

> I noticed that when trying to join a bdr group, if the database oid on 
> the new node happens to be the same as that of a node already in the 
> bdr group, the join fails. The only way I have found to resolve the 
> conflict is to drop and recreate the database, retrying until the dboid 
> does not conflict with any node already in the group.

That is extremely surprising. In our regression tests the database oids should 
be the same quite often, as we do various tests where we create multiple 
instances. More importantly, every time you bdr_init_copy, you get a clone with 
the same database oid, and that works fine.

There's no detail here to work from, so I cannot guess what's actually 
happening, but I can confidently say it's not a database oid conflict.
Nowhere in BDR should the database oid be considered without the rest of the 
(sysid,timeline,dboid) tuple.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




[GENERAL] BDR question on dboid conflicts

2017-10-26 Thread Zhu, Joshua
The database oid is used in both bdr.bdr_nodes (as node_dboid) and 
bdr.bdr_connections (as conn_dboid), and also in the construction of 
replication slot names.

I noticed that when trying to join a bdr group, if the database oid on the new 
node happens to be the same as that of a node already in the bdr group, the 
join fails. The only way I have found to resolve the conflict is to drop and 
recreate the database, retrying until the dboid does not conflict with any 
node already in the group.

Is there a better way to handle this kind of conflict, especially from a 
script?

Thanks
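
For the scripting part of the question, here is a minimal Python sketch of a pre-join check. It relies on the point made in the reply that nodes are identified by the whole (sysid, timeline, dboid) tuple, never by the dboid alone. The function name conflicts is hypothetical, the group values are the ones reported by bdr.bdr_get_local_nodeid() in the follow-up, and the second sysid is just an example value.

```python
from typing import Iterable, Tuple

NodeId = Tuple[str, int, int]  # (sysid, timeline, dboid)


def conflicts(existing: Iterable[NodeId], candidate: NodeId) -> bool:
    """Return True if the candidate's full identity tuple is already in use."""
    return candidate in set(existing)


group = [
    ("6480169638493465053", 1, 16386),
    ("6480169638493465053", 1, 20225),
    ("6480169638493465053", 1, 29164),
]

# Same sysid and timeline as the group, unused dboid: no clash.
same_sysid_new_dboid = conflicts(group, ("6480169638493465053", 1, 20227))
# Same sysid, timeline and dboid: the whole tuple collides.
same_sysid_same_dboid = conflicts(group, ("6480169638493465053", 1, 20225))
# Different sysid (example value), same dboid: no clash.
other_sysid_same_dboid = conflicts(group, ("6334686800251932108", 1, 20225))
```

In this environment all nodes share one sysid and one timeline, so the dboid is the only part of the tuple that can differ. That may be why an equal dboid looked like the cause of the failed join here, but this is an inference from the reported values, not something confirmed in the thread.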


Re: [GENERAL] BDR replication port

2017-08-25 Thread Zhu, Joshua
Thanks for the idea, and that was it: it was indeed the other direction of 
replication that was affected by blocking the port.
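
A small Python sketch of that finding, assuming the receiving node opens the connection to the sending node's port, so that only a block on the sender's inbound port stops the flow. The function name replicates is hypothetical and the model ignores connections that were already established.

```python
def replicates(src: str, dst: str, inbound_blocked: set) -> bool:
    """Can changes made on src reach dst?

    Assumption: dst initiates the connection to src, so the flow only
    breaks when src's inbound port is blocked. dst's own inbound port
    plays no part in this direction.
    """
    return src not in inbound_blocked


blocked = {"10.3.122.31"}

# Changes on .21 still reach .31, because .31 connects outward to .21.
to_blocked_node = replicates("10.3.122.21", "10.3.122.31", blocked)
# Changes on .31 cannot reach .21, because .21 cannot connect to .31.
from_blocked_node = replicates("10.3.122.31", "10.3.122.21", blocked)
```

Under this model an operation that needs both directions, such as the table creation that hung, would stall until the port is reopened.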

-----Original Message-----
From: Alvaro Aguayo Garcia-Rada [mailto:aagu...@opensysperu.com] 
Sent: Friday, August 25, 2017 5:00 PM
To: Zhu, Joshua <j...@thalesesec.net>
Cc: PostgreSql-general <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] BDR replication port

That's weird. Another idea: Do changes on that server get replicated to the 
other servers? I'm not sure if incoming connections are used to receive WAL or 
to send it.

Regards,

Alvaro Aguayo
Jefe de Operaciones
Open Comb Systems E.I.R.L.

Oficina: (+51-1) 3377813 | RPM: #034252 / (+51) 995540103  | RPC: (+51) 
954183248
Website: www.ocs.pe



Re: [GENERAL] BDR replication port

2017-08-25 Thread Zhu, Joshua
Thought about that possibility, so postgres on the node with the blocked port 
was restarted after blocking the port.

-----Original Message-----
From: Alvaro Aguayo Garcia-Rada [mailto:aagu...@opensysperu.com] 
Sent: Friday, August 25, 2017 3:23 PM
To: Zhu, Joshua <j...@thalesesec.net>
Cc: PostgreSql-general <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] BDR replication port

Just a guess: How did you block the port? Depending on that, you could be 
blocking only new connections, while connections already established would 
continue to transmit data; remember that BDR only reconnects when a connection 
is lost.

Alvaro Aguayo
Jefe de Operaciones
Open Comb Systems E.I.R.L.

Oficina: (+51-1) 3377813 | RPM: #034252 / (+51) 995540103  | RPC: (+51) 
954183248
Website: www.ocs.pe



[GENERAL] BDR replication port

2017-08-25 Thread Zhu, Joshua

Hi, I am experimenting with how network configuration impacts BDR replication, 
ran into something that I can't explain, and wonder if someone can shed light 
on it. Here it goes:

With a four-node BDR group configured and running (all using the default port 
5432), I purposely blocked port 5432 on one of the nodes in the group, expecting 
changes on the other nodes to stop being replicated to this node, but that's 
not what happened.

Shell commands show that the port was indeed blocked (in the following example 
session, port 5432 is blocked on 10.3.122.31, but open on 10.3.122.21):

% nc -v --send-only 10.3.122.21 5432
Ncat: Connected to 10.3.122.21:5432.
Ncat: 0 bytes sent, 0 bytes received in 0.00 seconds.

% nc -v --send-only 10.3.122.31 5432
Ncat: Connection timed out.

% psql -h 10.3.122.21 mydb
psql (9.4.10)
Type "help" for help.
mydb=#

% psql -h 10.3.122.31 mydb
psql: could not connect to server: Connection timed out
Is the server running on host "10.3.122.31" and accepting
TCP/IP connections on port 5432?
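
The same reachability check can be scripted. Below is a minimal Python sketch using only the standard library; the helper name port_open and the 3-second default timeout are assumptions. Like nc, it only tests whether a new TCP connection can be opened and says nothing about connections that were established before the port was blocked.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a new TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example use against the two nodes from the session above:
#   port_open("10.3.122.21", 5432)
#   port_open("10.3.122.31", 5432)
```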

In this state, I tried inserts and updates on node 10.3.122.21, all of which 
were replicated to node 10.3.122.31. However, an attempt to create a new table 
on node 10.3.122.21 was stuck (as expected) until port 5432 on 10.3.122.31 was 
opened again.

So my question is: is there a port other than 5432 that BDR uses for 
replication? If not, how could changes be replicated to 10.3.122.31 when its 
port 5432 was blocked?

Thanks,



[GENERAL] puzzled by deletion performance

2017-07-14 Thread Zhu, Joshua

I have the following (hypothetical) tables and their relationships (primary 
keys are in square brackets):

[server_id]         [device_id]         [sensor_id]         [property_id]
SERVER  --- 1:n --- DEVICE  --- 1:n --- SENSOR  --- 1:n --- PROPERTY
                      |                   |
                      |                   m
                      |                   |
                      |                MAPPING [mapping_id]
                      |                   |
                      |                   n
                      |                   |
                      + ----- 1:n ----- AGENT [agent_id]


They have the following record counts:

SERVER:   10
DEVICE:1 for each server
SENSOR:   12 for devices on each server
AGENT:15 for devices on each server
PROPERTY: 44 for sensors on each server
MAPPING:  45 for sensors and agents on each server

When all records belonging to a server need to be deleted (say, the server with 
server_id 1), the following SQL statements are executed, in this order, each in 
its own transaction:

-- statement 1
delete from MAPPING where mapping_id in
  (select mapping_id from MAPPING where sensor_id in
    (select sensor_id from SENSOR where device_id in
      (select device_id from DEVICE where server_id = 1)));

-- statement 2
delete from PROPERTY where property_id in
  (select property_id from PROPERTY where sensor_id in
    (select sensor_id from SENSOR where device_id in
      (select device_id from DEVICE where server_id = 1)));

-- statement 3
delete from AGENT where agent_id in
  (select agent_id from AGENT where device_id in
    (select distinct device_id from DEVICE where server_id = 1));

-- statement 4
delete from SENSOR where sensor_id in
  (select sensor_id from SENSOR where device_id in
    (select device_id from DEVICE where server_id = 1));

-- statement 5
delete from DEVICE where device_id in
  (select device_id from DEVICE where server_id = 1);

-- statement 6
delete from SERVER where server_id = 1;

The first three statements complete fairly quickly. Statement 4, however, takes 
very significantly longer to execute, which is puzzling, especially compared 
with statement 3: the latter actually has more records to delete, and the 
execution plans reported by "explain" are practically identical (statement 3 
just has more rows and a slightly higher cost).

Can anyone shed some light on this behavior, or suggest how statement 4 can be 
rewritten to perform better (there is already an index on the PROPERTY table's 
foreign key sensor_id)?

Thanks
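
On the rewrite part of the question: the outer "sensor_id in (select sensor_id from SENSOR ...)" wrapper selects the same rows as filtering on device_id directly. The Python sketch below checks that equivalence on a tiny hypothetical schema, using SQLite purely as a stand-in for PostgreSQL. It shows that the two forms delete the same rows and says nothing about their relative speed.

```python
import sqlite3


def build(conn: sqlite3.Connection) -> None:
    """Create a tiny stand-in for the DEVICE and SENSOR tables."""
    conn.executescript("""
        CREATE TABLE device (device_id INTEGER PRIMARY KEY, server_id INTEGER);
        CREATE TABLE sensor (sensor_id INTEGER PRIMARY KEY, device_id INTEGER);
        INSERT INTO device VALUES (1, 1), (2, 2);
        INSERT INTO sensor VALUES (10, 1), (11, 1), (20, 2);
    """)


# Statement 4 as posted, with the self-referencing subquery.
ORIGINAL = """
    DELETE FROM sensor WHERE sensor_id IN
      (SELECT sensor_id FROM sensor WHERE device_id IN
        (SELECT device_id FROM device WHERE server_id = 1))
"""

# The same deletion without the extra level of nesting.
SIMPLER = """
    DELETE FROM sensor WHERE device_id IN
      (SELECT device_id FROM device WHERE server_id = 1)
"""


def remaining_after(statement: str) -> list:
    """Run one DELETE on a fresh database and return the surviving sensor ids."""
    conn = sqlite3.connect(":memory:")
    build(conn)
    conn.execute(statement)
    rows = [r[0] for r in conn.execute(
        "SELECT sensor_id FROM sensor ORDER BY sensor_id")]
    conn.close()
    return rows


original_left = remaining_after(ORIGINAL)
simpler_left = remaining_after(SIMPLER)
```

A plain "explain" in PostgreSQL does not include time spent in foreign-key triggers, whereas EXPLAIN ANALYZE reports it. If SENSOR is referenced by other tables (MAPPING, for instance), it may be worth checking whether each referencing column is indexed. That is only a possible cause and is not confirmed anywhere in this thread.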



Re: [GENERAL] BDR node removal and rejoin

2017-07-13 Thread Zhu, Joshua

Found these log entries on one of the other nodes:

t=2017-07-13 08:35:34 PDT p=27292 a=DEBUG:  0: found valid replication identifier 15
t=2017-07-13 08:35:34 PDT p=27292 a=LOCATION:  bdr_establish_connection_and_slot, bdr.c:604
t=2017-07-13 08:35:34 PDT p=27292 a=ERROR:  53400: no free replication state could be found for 15, increase max_replication_slots

Increased max_replication_slots, things are looking good now, thanks.

This does bring up a couple of questions:


  1.  Given that this repeated removal/rejoining exercise caused no real 
increase in the number of nodes, yet used up the replication slots, shouldn’t 
removing a node also automatically free the replication slot allocated for it? 
Or is there a way to manually free slots that are no longer needed? (They 
don’t seem to show up in the pg_replication_slots view; I made sure to use 
pg_drop_replication_slot when they do show up there.)
  2.  Is there a rule of thumb for the best value of max_replication_slots 
(and is it somehow related to the value of max_wal_senders as well) with 
respect to, say, the maximum number of nodes to be supported?

Thanks
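
As a rough starting point for the second question, here is a Python sketch that assumes each node keeps one slot per peer per replicated database. That matches the three slots seen on one node of a four-node group elsewhere in this archive, but it is not an official sizing rule. The function name and the headroom parameter are hypothetical, and the relationship to max_wal_senders is left open, as in the question.

```python
def min_replication_slots(nodes: int, bdr_databases: int = 1,
                          headroom: int = 0) -> int:
    """Lower bound on max_replication_slots for one node of a full mesh.

    Assumption: one slot per peer node for each replicated database,
    plus optional headroom for nodes that are being removed or rejoined.
    """
    if nodes < 1 or bdr_databases < 1 or headroom < 0:
        raise ValueError("nodes and bdr_databases must be >= 1, headroom >= 0")
    return (nodes - 1) * bdr_databases + headroom


four_node_group = min_replication_slots(4)
five_nodes_with_spare = min_replication_slots(5, headroom=2)
```

Given that repeated remove/rejoin cycles exhausted the limit here, some headroom above the bare minimum seems prudent.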

From: Craig Ringer [mailto:cr...@2ndquadrant.com]
Sent: Wednesday, July 12, 2017 11:59 PM
To: Zhu, Joshua <j...@thalesesec.net>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] BDR node removal and rejoin

On 13 July 2017 at 01:56, Zhu, Joshua <j...@vormetric.com> wrote:
Thanks for the clarification.

Looks like I am running into a different issue: while trying to pin down 
precisely the steps (and the order in which to perform them) needed to 
remove/rejoin a node, I repeated the removal/rejoining exercise a number of 
times, and it got stuck again:


  1.  The status of the re-joining node (node4) on the other nodes is “i”
  2.  The status of the re-joining node on node4 itself started at “i”, 
changed to “o”, then stuck there
  3.  From the log file for node4, the following entries are constantly being 
generated:

2017-07-12 10:37:46 PDT [24943:bdr 
(6334686800251932108,1,43865,):receive:::1(33883)]DEBUG:  0: received 
replication command: IDENTIFY_SYSTEM
2017-07-12 10:37:46 PDT [24943:bdr 
(6334686800251932108,1,43865,):receive:::1(33883)]LOCATION:  
exec_replication_command, walsender.c:1309
2017-07-12 10:37:46 PDT [24943:bdr 
(6334686800251932108,1,43865,):receive:::1(33883)]DEBUG:  08003: unexpected EOF 
on client connection
2017-07-12 10:37:46 PDT [24943:bdr 
(6334686800251932108,1,43865,):receive:::1(33883)]LOCATION:  SocketBackend, 
postgres.c:355
2017-07-12 10:37:46 PDT [24944:bdr 
(6408408103171110238,1,24713,):receive:::1(33884)]DEBUG:  0: received 
replication command: IDENTIFY_SYSTEM
2017-07-12 10:37:46 PDT [24944:bdr 
(6408408103171110238,1,24713,):receive:::1(33884)]LOCATION:  
exec_replication_command, walsender.c:1309
2017-07-12 10:37:46 PDT [24944:bdr 
(6408408103171110238,1,24713,):receive:::1(33884)]DEBUG:  08003: unexpected EOF 
on client connection

Check the logs on the other end.



--
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [GENERAL] BDR node removal and rejoin

2017-07-12 Thread Zhu, Joshua
Thanks for the clarification.

Looks like I am running into a different issue: while trying to pin down 
precisely the steps (and the order in which to perform them) needed to 
remove/rejoin a node, I repeated the removal/rejoining exercise a number of 
times, and it got stuck again:


  1.  The status of the re-joining node (node4) on the other nodes is “i”
  2.  The status of the re-joining node on node4 itself started at “i”, 
changed to “o”, then stuck there
  3.  From the log file for node4, the following entries are constantly being 
generated:

2017-07-12 10:37:46 PDT [24943:bdr 
(6334686800251932108,1,43865,):receive:::1(33883)]DEBUG:  0: received 
replication command: IDENTIFY_SYSTEM
2017-07-12 10:37:46 PDT [24943:bdr 
(6334686800251932108,1,43865,):receive:::1(33883)]LOCATION:  
exec_replication_command, walsender.c:1309
2017-07-12 10:37:46 PDT [24943:bdr 
(6334686800251932108,1,43865,):receive:::1(33883)]DEBUG:  08003: unexpected EOF 
on client connection
2017-07-12 10:37:46 PDT [24943:bdr 
(6334686800251932108,1,43865,):receive:::1(33883)]LOCATION:  SocketBackend, 
postgres.c:355
2017-07-12 10:37:46 PDT [24944:bdr 
(6408408103171110238,1,24713,):receive:::1(33884)]DEBUG:  0: received 
replication command: IDENTIFY_SYSTEM
2017-07-12 10:37:46 PDT [24944:bdr 
(6408408103171110238,1,24713,):receive:::1(33884)]LOCATION:  
exec_replication_command, walsender.c:1309
2017-07-12 10:37:46 PDT [24944:bdr 
(6408408103171110238,1,24713,):receive:::1(33884)]DEBUG:  08003: unexpected EOF 
on client connection
2017-07-12 10:37:46 PDT [24944:bdr 
(6408408103171110238,1,24713,):receive:::1(33884)]LOCATION:  SocketBackend, 
postgres.c:355
2017-07-12 10:37:46 PDT [24946:bdr 
(6334686760735153516,1,43845,):receive:::1(33885)]DEBUG:  0: received 
replication command: IDENTIFY_SYSTEM
2017-07-12 10:37:46 PDT [24946:bdr 
(6334686760735153516,1,43845,):receive:::1(33885)]LOCATION:  
exec_replication_command, walsender.c:1309
2017-07-12 10:37:46 PDT [24946:bdr 
(6334686760735153516,1,43845,):receive:::1(33885)]DEBUG:  08003: unexpected EOF 
on client connection
2017-07-12 10:37:46 PDT [24946:bdr 
(6334686760735153516,1,43845,):receive:::1(33885)]LOCATION:  SocketBackend, 
postgres.c:355
2017-07-12 10:37:49 PDT [24949:bdr 
(6394432535408825526,1,37325,):receive:::1(33892)]DEBUG:  0: received 
replication command: IDENTIFY_SYSTEM
2017-07-12 10:37:49 PDT [24949:bdr 
(6394432535408825526,1,37325,):receive:::1(33892)]LOCATION:  
exec_replication_command, walsender.c:1309
2017-07-12 10:37:49 PDT [24949:bdr 
(6394432535408825526,1,37325,):receive:::1(33892)]DEBUG:  08003: unexpected EOF 
on client connection
2017-07-12 10:37:49 PDT [24949:bdr 
(6394432535408825526,1,37325,):receive:::1(33892)]LOCATION:  SocketBackend, 
postgres.c:355

What do these entries mean, and what can be done to correct the situation? 
(There has been no change to either the postgres or the network configuration 
during the remove/rejoin exercise.)

Thanks

From: Craig Ringer [mailto:cr...@2ndquadrant.com]
Sent: Wednesday, July 12, 2017 1:59 AM
To: Zhu, Joshua <j...@thalesesec.net>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] BDR node removal and rejoin

On 11 July 2017 at 05:49, Zhu, Joshua <j...@vormetric.com> wrote:
An update… after manually removing the record for ‘node4’ from bdr.bdr_nodes, 
the corresponding record in bdr.bdr_connections, and the associated replication 
slot (with pg_drop_replication_slot), rejoining was successful.

I was under the impression that no manual cleanup is needed before a removed 
node (with its database dropped and recreated) rejoins a BDR group.

BDR1 requires that you manually remove the bdr.bdr_nodes entry if you intend to 
re-use the same node name.


--
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [GENERAL] BDR node removal and rejoin

2017-07-10 Thread Zhu, Joshua
An update... after manually removing the record for 'node4' from bdr.bdr_nodes, 
the corresponding record in bdr.bdr_connections, and the associated replication 
slot (with pg_drop_replication_slot), rejoining was successful.

I was under the impression that no manual cleanup is needed before a removed 
node (with its database dropped and recreated) rejoins a BDR group.
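
The manual cleanup described above can be written down as a list of parameterized statements. The Python sketch below only builds the statements and does not run them. The column names node_name, conn_sysid and conn_timeline are assumptions about the BDR 1.x catalogs (only node_dboid and conn_dboid are named elsewhere in this archive), the function name is hypothetical, and the values are examples only.

```python
from typing import List, Tuple

Statement = Tuple[str, tuple]


def cleanup_statements(node_name: str, sysid: str, timeline: int,
                       dboid: int, slot_name: str) -> List[Statement]:
    """Build (sql, params) pairs for the manual cleanup of a parted node.

    Column names other than conn_dboid are assumed, not confirmed here.
    """
    return [
        ("DELETE FROM bdr.bdr_connections "
         "WHERE conn_sysid = %s AND conn_timeline = %s AND conn_dboid = %s",
         (sysid, timeline, dboid)),
        ("DELETE FROM bdr.bdr_nodes WHERE node_name = %s",
         (node_name,)),
        ("SELECT pg_drop_replication_slot(%s)",
         (slot_name,)),
    ]


# Example values only; substitute the identity and slot of the parted node.
steps = cleanup_statements(
    node_name="node4",
    sysid="6480169638493465053",
    timeline=1,
    dboid=20227,
    slot_name="bdr_16386_6480169638493465053_1_20227__",
)
```

The %s placeholders follow the style used by common PostgreSQL drivers for Python, so each pair can be passed to a cursor's execute call once the values have been checked by hand.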





[GENERAL] BDR node removal and rejoin

2017-07-07 Thread Zhu, Joshua
Hi, I am having difficulty removing a node from a BDR group (with nodes node1 
through node5) and then rejoining it to the group.

Prior to removing a node, BDR is running fine, and a query on the bdr.bdr_nodes 
table shows all nodes with status 'r'.

Here is what I have done to remove node5 and rejoin it:


  *   On node1, do bdr.bdr_part_by_node_names

At this point the status of node5 in bdr.bdr_nodes becomes 'k'


  *   On node5, do bdr.remove_bdr_from_local_node
  *   On node5, drop and recreate the database, then rejoin using 
bdr.bdr_group_join (using the same node name and external dsn)

At this point the status of node5 on node1 through node4 still remains 'k', and 
the status of node5 on node5 (where there is only one record) is 'i'; the nodes 
are stuck at these status codes.
[note: I tried using a different node name on rejoining, same result]

What have I done wrong, and what is the correct way to do removal and rejoining?

Thanks




Re: [GENERAL] How does BDR replicate changes among nodes in a BDR group

2017-06-08 Thread Zhu, Joshua
Thanks for the clarification.

A follow-up question, then, given that *once joined all nodes are equal*:

should node A die or be taken out of the group, the remaining three-node 
group (with B, C and D) would continue to function properly, correct?
[somewhere I saw the term "downstream" nodes used, and I am not clear what 
that means in the context of a mesh-connected group] 

Thanks again

-----Original Message-----
From: Craig Ringer [mailto:cr...@2ndquadrant.com] 
Sent: Wednesday, June 07, 2017 5:59 PM
To: Zhu, Joshua <j...@thalesesec.net>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] How does BDR replicate changes among nodes in a BDR group

On 8 June 2017 at 04:50, Zhu, Joshua <j...@vormetric.com> wrote:

> How does BDR replicate a change delta on A to B, C, and D?

It's a mesh.

Once joined, it doesn't matter what the join node was, all nodes are equal.

> e.g., A
> replicates delta to B and D, and B to C, or some other way, or not 
> statically determined?

Each node replicates to all other nodes in an undefined order determined by 
network timing etc.
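
To make the mesh concrete, here is a small Python sketch that lists every directed replication link for a set of nodes. The function name mesh_links is hypothetical, and each pair reads as (node where the change happened, node that receives it). Join order does not appear anywhere in the calculation.

```python
from itertools import permutations
from typing import List, Sequence, Tuple


def mesh_links(nodes: Sequence[str]) -> List[Tuple[str, str]]:
    """Every ordered pair of distinct nodes: (origin, receiver)."""
    return list(permutations(nodes, 2))


links = mesh_links(["A", "B", "C", "D"])
# With A removed, the remaining nodes still form a complete mesh.
after_a_leaves = mesh_links(["B", "C", "D"])
```

For four nodes this gives 4 * 3 = 12 links, including the direct pair from C to A even though C joined via B. Without A, six links remain among B, C and D.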


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[GENERAL] How does BDR replicate changes among nodes in a BDR group

2017-06-07 Thread Zhu, Joshua
New to this group, so if this is not the right place to ask this question, or 
if it has been asked before or is documented, please kindly point me to the 
right group or the right thread/documentation, thanks.

As a BDR novice, I would like to know how BDR replicates changes among the 
nodes in a BDR group. Let's say I have a four-node group consisting of A, B, C 
and D, established as follows:

A is the initial node
B joins via A
C joins via B
D joins via A

How does BDR replicate a change delta on A to B, C, and D? e.g., A replicates 
delta to B and D, and B to C, or some other way, or not statically determined?
How about a change delta on B to A, C and D? e.g., B replicates delta to A and 
C, A to D, or?
How about a change delta on C to A, B and D? e.g., C replicates delta to B, B 
to A, A to D, or?
How about a change delta on D to A, B and C? e.g., D replicates delta to A, A 
to B, B to C, or?

Thanks