On 05/09/16 23:35, Steve Singer wrote:
On 09/05/2016 03:58 PM, Steve Singer wrote:
On 08/31/2016 04:51 PM, Petr Jelinek wrote:
Hi,

Here is one more version with bug fixes, improved code docs, a couple
more tests, some general cleanup, and a rebase on current master
for the start of the CF.

A few more things I noticed when playing with the patches:

1. Creating a subscription to yourself ends pretty badly. The
'CREATE SUBSCRIPTION' command seems to get stuck, and you can't kill
it.  The background process seems to be waiting for a transaction to
commit (I assume the create subscription command).  I had to kill -9 the
various processes to get things to stop.  Getting confused about
hostnames and ports is a common operator error.


Hmm, I guess there is a missing interrupts check; I will look. It would be great to detect this properly, but I am not really sure how to do that, as AFAIK there is no accurate way to detect that the connection is to yourself.

2. Failures during the initial subscription aren't recoverable.

For example:

on db1
  create table a(id serial4 primary key,b text);
  insert into a(b) values ('1');
  create publication testpub for table a;

on db2
  create table a(id serial4 primary key,b text);
  insert into a(b) values ('1');
  create subscription testsub connection 'host=localhost port=5440 dbname=test' publication testpub;

I then get the following in my db2 log:

ERROR:  duplicate key value violates unique constraint "a_pkey"
DETAIL:  Key (id)=(1) already exists.
LOG:  worker process: logical replication worker 16396 sync 16387 (PID 10583) exited with exit code 1
LOG:  logical replication sync for subscription testsub, table a started
ERROR:  could not crate replication slot "testsub_sync_a": ERROR:  replication slot "testsub_sync_a" already exists
LOG:  worker process: logical replication worker 16396 sync 16387 (PID 10585) exited with exit code 1
LOG:  logical replication sync for subscription testsub, table a started
ERROR:  could not crate replication slot "testsub_sync_a": ERROR:  replication slot "testsub_sync_a" already exists


and it keeps looping.
If I then truncate "a" on db2, it doesn't help. (I'd expect the initial
subscription to work at that point.)

Hmm, looks like the error case does not clean up correctly after itself.


If I then run this on db2:
 drop subscription testsub cascade;

I still see a slot in use on db1:

select * FROM pg_replication_slots ;
   slot_name    |  plugin  | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn
----------------+----------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
 testsub_sync_a | pgoutput | logical   |  16384 | test     | f      |            |      |         1173 | 0/1566E08   | 0/1566E40


Same as above.
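
Until the cleanup is fixed, the leftover slot can presumably be removed
by hand on db1. A minimal sketch, assuming the slot name from the output
above and that the slot is inactive (active = f), since
pg_drop_replication_slot() refuses to drop a slot that is in active use:

```sql
-- on db1: confirm the leftover sync slot exists and is not active
select slot_name, active
  from pg_replication_slots
 where slot_name = 'testsub_sync_a';

-- drop it so it stops retaining WAL and holding back catalog_xmin
select pg_drop_replication_slot('testsub_sync_a');
```

This is only a manual workaround for the symptom shown above, not a
substitute for the error path cleaning up after itself.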

--
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
