Here is what I did (node 1 = master, node 8 = slave):
$ cat doit.slon
cluster name = adfrontiers;
node 1 admin conninfo='host=10.2.1.10 dbname=af_db_main user=slony port=5432';
node 8 admin conninfo='host=10.2.1.20 dbname=af_db_main user=slony port=5432';
execute script (
set id = 2,
filename = '/home/slony/2/doit.sql',
event node = 1
);
$ cat doit.sql
ALTER TABLE c_m
ADD COLUMN bytes INT4 NOT NULL DEFAULT '0';
$ cat doit.slon | slonik
<stdin>:4: PGRES_FATAL_ERROR select "_adfrontiers".ddlScript_prepare(2,
-1); - ERROR: deadlock detected
DETAIL: Process 14876 waits for AccessExclusiveLock on relation 20254
of database 20251; blocked by process 14911.
Process 14911 waits for AccessShareLock on relation 20667 of database
20251; blocked by process 14876.
CONTEXT: SQL statement "lock table "public"."account" in access
exclusive mode"
PL/pgSQL function "altertablerestore" line 47 at execute statement
SQL statement "SELECT "_adfrontiers".alterTableRestore(tab_id) from
"_adfrontiers".sl_table where tab_set in (select set_id from
"_adfrontiers".sl_set where set_origin =
"_adfrontiers".getLocalNodeId('_adfrontiers'))"
PL/pgSQL function "ddlscript_prepare" line 30 at perform
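In hindsight, I suppose a query like the one below, run on the master while the
"execute script" was hung, would have shown which backends were fighting over
those two relations. This is just a sketch against the standard pg_locks and
pg_class catalogs; the only specifics in it are the two relation OIDs from the
deadlock detail above.

-- Run on the master (node 1) while ddlScript_prepare is waiting.
-- 20254 and 20667 are the relation OIDs named in the deadlock detail.
select l.pid, c.relname, l.mode, l.granted
  from pg_locks l
  join pg_class c on c.oid = l.relation
 where l.relation in (20254, 20667)
 order by c.relname, l.pid;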
*
* Error. But the slony logs seem just fine. Ran it again.
*
[EMAIL PROTECTED] 2]$ cat doit.slon | slonik
DDL script consisting of 1 SQL statements
DDL Statement 0: (0,73) [ALTER TABLE c_m
ADD COLUMN bytes INT4 NOT NULL DEFAULT '0';]
Submit DDL Event to subscribers...
DDL on origin - PGRES_TUPLES_OK
*
* Reports no problem. But the logs on the slave go haywire; this repeats
* over and over:
*
CONTEXT: SQL statement "SELECT
"_adfrontiers".alterTableForReplication( $1 )"
PL/pgSQL function "ddlscript_complete_int" line 11 at perform
2007-03-02 05:37:10 GMT DEBUG2 slon_retry() from pid=22438
2007-03-02 05:37:10 GMT DEBUG1 slon: retry requested
2007-03-02 05:37:10 GMT DEBUG2 slon: notify worker process to shutdown
2007-03-02 05:37:10 GMT INFO remoteListenThread_1: disconnecting from
'host=10.2.1.10 dbname=af_db_main user=slony port=5432'
2007-03-02 05:37:10 GMT DEBUG1 remoteListenThread_1: thread done
2007-03-02 05:37:10 GMT DEBUG1 cleanupThread: thread done
2007-03-02 05:37:10 GMT DEBUG1 localListenThread: thread done
2007-03-02 05:37:10 GMT DEBUG1 syncThread: thread done
2007-03-02 05:37:10 GMT DEBUG1 main: scheduler mainloop returned
2007-03-02 05:37:10 GMT DEBUG2 main: wait for remote threads
2007-03-02 05:37:10 GMT DEBUG2 sched_wakeup_node(): no_id=1 (0 threads +
worker signaled)
2007-03-02 05:37:10 GMT DEBUG1 main: done
2007-03-02 05:37:10 GMT DEBUG2 slon: child terminated status: 0; pid:
22438, current worker pid: 22438
2007-03-02 05:37:10 GMT DEBUG1 slon: restart of worker
2007-03-02 05:37:10 GMT CONFIG main: slon version 1.2.6 starting up
2007-03-02 05:37:10 GMT DEBUG2 slon: watchdog process started
2007-03-02 05:37:10 GMT DEBUG2 slon: watchdog ready - pid = 22436
2007-03-02 05:37:10 GMT DEBUG2 slon: worker process created - pid = 5580
2007-03-02 05:37:10 GMT CONFIG main: local node id = 8
2007-03-02 05:37:10 GMT DEBUG2 main: main process started
2007-03-02 05:37:10 GMT CONFIG main: launching sched_start_mainloop
2007-03-02 05:37:10 GMT CONFIG main: loading current cluster
configuration
2007-03-02 05:37:10 GMT CONFIG storeNode: no_id=1 no_comment='Node 1 -
[EMAIL PROTECTED]'
2007-03-02 05:37:10 GMT DEBUG2 setNodeLastEvent: no_id=1 event_seq=56122
2007-03-02 05:37:10 GMT CONFIG storePath: pa_server=1 pa_client=8
pa_conninfo="host=10.2.1.10 dbname=af_db_main user=slony port=5432"
pa_connretry=10
2007-03-02 05:37:10 GMT CONFIG storeListen: li_origin=1 li_receiver=8
li_provider=1
2007-03-02 05:37:10 GMT CONFIG storeSet: set_id=1 set_origin=1
set_comment='Set 1 for adfrontiers'
2007-03-02 05:37:10 GMT WARN remoteWorker_wakeup: node 1 - no worker
thread
2007-03-02 05:37:10 GMT DEBUG2 sched_wakeup_node(): no_id=1 (0 threads +
worker signaled)
2007-03-02 05:37:10 GMT CONFIG storeSet: set_id=2 set_origin=1
set_comment='Set 2 for adfrontiers'
2007-03-02 05:37:10 GMT WARN remoteWorker_wakeup: node 1 - no worker
thread
2007-03-02 05:37:10 GMT DEBUG2 sched_wakeup_node(): no_id=1 (0 threads +
worker signaled)
...
2007-03-02 05:39:58 GMT CONFIG remoteWorkerThread_1: DDL Statement 0:
[ALTER TABLE c_m
ADD COLUMN bytes INT4 NOT NULL DEFAULT '0';]
2007-03-02 05:39:58 GMT CONFIG DDL success - PGRES_COMMAND_OK
2007-03-02 05:39:58 GMT ERROR remoteWorkerThread_1: "select
"_adfrontiers".ddlScript_complete_int(2, -1); notify
"_adfrontiers_Event"; notify "_adfrontiers_Confirm"; insert into
"_adfrontiers".sl_event (ev_origin, ev_seqno, ev_timestamp,
ev_minxid, ev_maxxid, ev_xip, ev_type , ev_data1, ev_data2,
ev_data3 ) values ('1', '56123', '2007-03-02 05:37:04.861017',
'13928700', '13928701', '', 'DDL_SCRIPT', '2', 'ALTER TABLE c_m
ADD COLUMN bytes INT4 NOT NULL DEFAULT ''0'';
', '-1'); insert into "_adfrontiers".sl_confirm (con_origin,
con_received, con_seqno, con_timestamp) values (1, 8, '56123',
now()); commit transaction;" PGRES_FATAL_ERROR ERROR: Slony-I:
alterTableForReplication(): Table "public"."account" is already in
altered state
CONTEXT: SQL statement "SELECT
"_adfrontiers".alterTableForReplication( $1 )"
PL/pgSQL function "ddlscript_complete_int" line 11 at perform
2007-03-02 05:39:58 GMT DEBUG2 slon_retry() from pid=17929
...
2007-03-02 05:39:58 GMT DEBUG1 slon: retry requested
2007-03-02 05:39:58 GMT DEBUG2 remoteListenThread_1: queue event 1,56141
SYNC
2007-03-02 05:39:58 GMT CONFIG remoteWorkerThread_1: DDL request with 1
statements
2007-03-02 05:39:58 GMT CONFIG remoteWorkerThread_1: DDL Statement 0:
[ALTER TABLE c_m
ADD COLUMN bytes INT4 NOT NULL DEFAULT '0';]
2007-03-02 05:39:58 GMT CONFIG DDL success - PGRES_COMMAND_OK
2007-03-02 05:39:58 GMT ERROR remoteWorkerThread_1: "select
"_adfrontiers".ddlScript_complete_int(2, -1); notify
"_adfrontiers_Event"; notify "_adfrontiers_Confirm"; insert into
"_adfrontiers".sl_event (ev_origin, ev_seqno, ev_timestamp,
ev_minxid, ev_maxxid, ev_xip, ev_type , ev_data1, ev_data2,
ev_data3 ) values ('1', '56123', '2007-03-02 05:37:04.861017',
'13928700', '13928701', '', 'DDL_SCRIPT', '2', 'ALTER TABLE c_m
ADD COLUMN bytes INT4 NOT NULL DEFAULT ''0'';
', '-1'); insert into "_adfrontiers".sl_confirm (con_origin,
con_received, con_seqno, con_timestamp) values (1, 8, '56123',
now()); commit transaction;" PGRES_FATAL_ERROR ERROR: Slony-I:
alterTableForReplication(): Table "public"."account" is already in
altered state
CONTEXT: SQL statement "SELECT
"_adfrontiers".alterTableForReplication( $1 )"
PL/pgSQL function "ddlscript_complete_int" line 11 at perform
2007-03-02 05:39:58 GMT DEBUG2 slon_retry() from pid=17903
2007-03-02 05:39:58 GMT DEBUG1 slon: retry requested
2007-03-02 05:39:58 GMT DEBUG2 slon: notify worker process to shutdown
2007-03-02 05:39:58 GMT DEBUG1 main: scheduler mainloop returned
2007-03-02 05:39:58 GMT DEBUG2 main: wait for remote threads
2007-03-02 05:39:58 GMT DEBUG2 sched_wakeup_node(): no_id=1 (0 threads +
worker signaled)
2007-03-02 05:39:58 GMT DEBUG1 syncThread: thread done
2007-03-02 05:39:58 GMT DEBUG1 cleanupThread: thread done
--
Looking at the c_m table, the change took effect on the master but not on the
slave. I restarted the slon daemon on the slave, but it stays stuck in this
state, spewing the same error messages.
It looks like Slony is trying to set replication back up for the table and
assumes the table had been restored to its unaltered state for exclusive
access, but apparently that is not the case, so it raises an exception. Does
anyone have insight into this?
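The only workaround I can think of is to force the slave's tables back into
their restored state so the retried DDL_SCRIPT event can re-alter them.
Something like the following is what I have in mind; it is untested, just a
sketch built from the functions that show up in the log above, and it assumes
sl_table in 1.2.6 still carries the tab_reloid column and the tab_altered flag.

-- Run on the SLAVE (node 8) as the slony superuser.
-- First, see which replicated tables the slave believes are already altered.
select t.tab_id, c.relname, t.tab_altered
  from "_adfrontiers".sl_table t
  join pg_catalog.pg_class c on c.oid = t.tab_reloid
 order by t.tab_id;

-- Then put any table still flagged as altered back into its restored state,
-- so ddlscript_complete_int's alterTableForReplication() call stops failing
-- with "already in altered state".
select "_adfrontiers".alterTableRestore(tab_id)
  from "_adfrontiers".sl_table
 where tab_altered;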
I am going to reinit the master and slave. How should this be handled in the
future? If I blow away the slave via "uninstall node", will I also have to
uninstall the master, since it might still think the slave is subscribed to
its sets?
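For what it's worth, I assume the slonik side of blowing away the slave would
look something like this. It is a sketch only, I have not run it; it just
reuses the conninfo lines from doit.slon above with the standard drop node /
uninstall node commands.

cluster name = adfrontiers;
node 1 admin conninfo='host=10.2.1.10 dbname=af_db_main user=slony port=5432';
node 8 admin conninfo='host=10.2.1.20 dbname=af_db_main user=slony port=5432';

# Tell the surviving node(s) to forget the slave, so the master no longer
# thinks node 8 is subscribed to its sets.
drop node (id = 8, event node = 1);

# If the Slony schema on the slave then has to be removed by hand,
# "uninstall node (id = 8);" is the heavier hammer.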
Thanks,
Brett