Hello slony group,

I am testing with slony1-2.2.4. I recently hit an error that effectively
stops slon processing on a surviving node A after some other node B is
dropped. It reproduces only infrequently. As some will know, a slon daemon
that learns its own node has been dropped responds by dropping its cluster
schema. There appears to be a race condition between the schema drop on
node B and the receipt of the disableNode (drop node) event by the surviving
node A. If the former occurs before the latter, all the remote worker threads
on node A enter an error state. See the log samples below, where node 1 is
the surviving node A and node 999999 is the dropped node B. The first time,
I resolved this by deleting all the recent non-SYNC events from the sl_event
tables; more recently, a simple restart of the node A slon resolved it.

Please advise whether there is an existing ticket I should add this
information to, or whether I should create a new one. Thanks.
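
For anyone who wants to watch for this condition, here is a minimal Python
sketch (not part of Slony) that scans slon log text for remote worker threads
reporting a missing cluster schema, which is the signature seen in the node 1
log below. The function names and regular expressions are illustrative
assumptions based only on the log format shown in this message, and the
sample text is a shortened copy of those log lines.

```python
import re

# Start of a slon log entry: timestamp, "UTC", [pid], level, then the message.
ENTRY_START = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} UTC \[\d+\] (?P<level>[A-Z]+)\s+(?P<rest>.*)$"
)
# Message emitted by a remote worker thread.
WORKER = re.compile(r"^(remoteWorkerThread_\d+): ")
# PostgreSQL error text when the cluster schema has been dropped.
MISSING_SCHEMA = re.compile(r'schema "[^"]+" does not exist')


def split_entries(log_text):
    """Group raw lines into [level, message] entries.

    Lines without a leading timestamp (wrapped text, "LINE 1:" context)
    are appended to the preceding entry.
    """
    entries = []
    for line in log_text.splitlines():
        m = ENTRY_START.match(line)
        if m:
            entries.append([m.group("level"), m.group("rest")])
        elif entries:
            entries[-1][1] += " " + line.strip()
    return entries


def workers_missing_schema(log_text):
    """Return the sorted names of worker threads that logged a missing schema."""
    stuck = set()
    for level, text in split_entries(log_text):
        worker = WORKER.match(text)
        if level == "ERROR" and worker and MISSING_SCHEMA.search(text):
            stuck.add(worker.group(1))
    return sorted(stuck)


SAMPLE = '''
2016-07-08 18:06:33 UTC [30382] INFO   remoteWorkerThread_2: SYNC 5000017869 done in 0.002 seconds
2016-07-08 18:06:45 UTC [30382] ERROR  remoteWorkerThread_3: "select last_value from "_ams_cluster".sl_log_status" PGRES_FATAL_ERROR ERROR:  schema "_ams_cluster" does not exist
LINE 1: select last_value from "_ams_cluster".sl_log_status
2016-07-08 18:06:45 UTC [30382] ERROR  remoteWorkerThread_3: SYNC aborted
2016-07-08 18:06:45 UTC [30382] ERROR  remoteWorkerThread_2: "select last_value from "_ams_cluster".sl_log_status" PGRES_FATAL_ERROR ERROR:  schema "_ams_cluster" does not exist
'''

print(workers_missing_schema(SAMPLE))
```

A non-empty result only indicates that the error signature is present in the
log; whether to restart the slon in response is left to the operator.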


---- node 1 log ----
2016-07-08 18:06:31 UTC [30382] INFO   remoteWorkerThread_999999: SYNC 5000000008 done in 0.002 seconds
2016-07-08 18:06:33 UTC [30382] INFO   remoteWorkerThread_999999: SYNC 5000000009 done in 0.002 seconds
2016-07-08 18:06:33 UTC [30382] INFO   remoteWorkerThread_2: SYNC 5000017869 done in 0.002 seconds
2016-07-08 18:06:33 UTC [30382] INFO   remoteWorkerThread_3: SYNC 5000018148 done in 0.004 seconds
2016-07-08 18:06:45 UTC [30382] CONFIG remoteWorkerThread_2: update provider configuration
2016-07-08 18:06:45 UTC [30382] ERROR  remoteWorkerThread_3: "select last_value from "_ams_cluster".sl_log_status" PGRES_FATAL_ERROR ERROR:  schema "_ams_cluster" does not exist
LINE 1: select last_value from "_ams_cluster".sl_log_status
                               ^

2016-07-08 18:06:45 UTC [30382] ERROR  remoteWorkerThread_3: SYNC aborted
2016-07-08 18:06:45 UTC [30382] CONFIG version for "dbname=ams
      host=198.18.102.45
      user=ams_slony
      sslmode=verify-ca
      sslcert=/usr/local/akamai/.ams_certs/complete-ams_slony.crt
      sslkey=/usr/local/akamai/.ams_certs/ams_slony.private_key
      sslrootcert=/usr/local/akamai/etc/ssl_ca/canonical_ca_roots.pem" is 90119
2016-07-08 18:06:45 UTC [30382] ERROR  remoteWorkerThread_2: "select last_value from "_ams_cluster".sl_log_status" PGRES_FATAL_ERROR ERROR:  schema "_ams_cluster" does not exist
LINE 1: select last_value from "_ams_cluster".sl_log_status
                               ^

2016-07-08 18:06:45 UTC [30382] ERROR  remoteWorkerThread_2: SYNC aborted
2016-07-08 18:06:45 UTC [30382] ERROR  remoteListenThread_999999: "select ev_origin, ev_seqno, ev_timestamp,        ev_snapshot,        "pg_catalog".txid_snapshot_xmin(ev_snapshot),        "pg_catalog".txid_snapshot_xmax(ev_snapshot),        ev_type,        ev_data1, ev_data2,        ev_data3, ev_data4,        ev_data5, ev_data6,        ev_data7, ev_data8 from "_ams_cluster".sl_event e where (e.ev_origin = '999999' and e.ev_seqno > '5000000009') or (e.ev_origin = '2' and e.ev_seqno > '5000017870') or (e.ev_origin = '3' and e.ev_seqno > '5000018151') order by e.ev_origin, e.ev_seqno limit 40" - ERROR:  schema "_ams_cluster" does not exist
LINE 1: ...v_data5, ev_data6,        ev_data7, ev_data8 from "_ams_clus...
                                                             ^
2016-07-08 18:06:55 UTC [30382] ERROR  remoteWorkerThread_3: "start transaction; set enable_seqscan = off; set enable_indexscan = on; " PGRES_FATAL_ERROR ERROR:  current transaction is aborted, commands ignored until end of transaction block
2016-07-08 18:06:55 UTC [30382] ERROR  remoteWorkerThread_3: SYNC aborted
2016-07-08 18:06:55 UTC [30382] ERROR  remoteWorkerThread_2: "start transaction; set enable_seqscan = off; set enable_indexscan = on; " PGRES_FATAL_ERROR ERROR:  current transaction is aborted, commands ignored until end of transaction block
2016-07-08 18:06:55 UTC [30382] ERROR  remoteWorkerThread_2: SYNC aborted
----


---- node 999999 log ----
2016-07-08 18:06:44 UTC [558] INFO   remoteWorkerThread_1: SYNC 5000081216 done in 0.004 seconds
2016-07-08 18:06:44 UTC [558] INFO   remoteWorkerThread_2: SYNC 5000017870 done in 0.004 seconds
2016-07-08 18:06:44 UTC [558] INFO   remoteWorkerThread_3: SYNC 5000018150 done in 0.004 seconds
2016-07-08 18:06:44 UTC [558] INFO   remoteWorkerThread_1: SYNC 5000081217 done in 0.003 seconds
2016-07-08 18:06:44 UTC [558] WARN   remoteWorkerThread_3: got DROP NODE for local node ID
NOTICE:  Slony-I: Please drop schema "_ams_cluster"
NOTICE:  drop cascades to 171 other objects
DETAIL:  drop cascades to table _ams_cluster.sl_node
drop cascades to table _ams_cluster.sl_nodelock
drop cascades to table _ams_cluster.sl_set
drop cascades to table _ams_cluster.sl_setsync
drop cascades to table _ams_cluster.sl_table
----
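
The ordering across the two logs above (schema drop on node 999999 at
18:06:44, first errors on node 1 at 18:06:45) can be checked by interleaving
the entries by timestamp. The following Python sketch does that. The
timeline() helper and the node labels are hypothetical names chosen for
illustration, the sample lines are copied from the logs above, and the
comparison assumes the clocks on both hosts are reasonably in sync, since the
log timestamps only have one-second resolution.

```python
import re
from datetime import datetime

# Timestamped slon log line: "YYYY-MM-DD HH:MM:SS UTC [pid] message".
STAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) UTC \[\d+\] (.*)$")


def timeline(logs):
    """Merge several logs into one list sorted by timestamp.

    logs maps a node label to that node's log text. Lines without a
    leading timestamp are skipped. Returns (time, label, message) tuples.
    """
    events = []
    for label, text in logs.items():
        for line in text.splitlines():
            m = STAMP.match(line)
            if m:
                when = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
                events.append((when, label, m.group(2)))
    # sorted() is stable, so entries with equal timestamps keep input order.
    return sorted(events, key=lambda event: event[0])


LOGS = {
    "node 1": (
        "2016-07-08 18:06:45 UTC [30382] ERROR  remoteWorkerThread_3: SYNC aborted"
    ),
    "node 999999": (
        "2016-07-08 18:06:44 UTC [558] WARN   remoteWorkerThread_3: "
        "got DROP NODE for local node ID"
    ),
}

for when, label, message in timeline(LOGS):
    print(when.time(), label, message)
```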

            Tom    ☺



_______________________________________________
Slony1-general mailing list
Slony1-general@lists.slony.info
http://lists.slony.info/mailman/listinfo/slony1-general
