On 2/1/2007 9:39 AM, Andrzej Kwiatkowski wrote:
> This is my first post here, so greetings to all.
> And here is my problem with failover.
> 
> The idea of my test is:
> 1,2 - master & slave for the database (switched in case of failure)
> 3 - replica for some reports and so on.
> 
> I'm starting my test slony cluster with:
> 
> 
> 
> cluster name = test;
> 
> node 1 admin conninfo = 'dbname=isp host=localhost port=5432 user=pgsql';
> node 2 admin conninfo = 'dbname=isp host=localhost port=6000 user=pgsql';
> node 3 admin conninfo = 'dbname=isp host=localhost port=6001 user=pgsql';
> 
> init cluster ( id = 1, comment= 'MASTER');
> 
> store node (id=2, comment = 'SLAVE');
> store node (id=3, comment = 'REPLICA');
> 
> # between master
> store path (server = 1, client = 2, conninfo ='dbname=isp host=localhost port=5432 user=pgsql' );
> store path (server = 2, client = 1, conninfo ='dbname=isp host=localhost port=6000 user=pgsql' );
> 
> # from master to replica
> store path (server = 1, client = 3, conninfo ='dbname=isp host=localhost port=5432 user=pgsql' );
> store path (server = 2, client = 3, conninfo ='dbname=isp host=localhost port=6000 user=pgsql' );
> 
> # from replica to master
> store path (server = 3, client = 1, conninfo ='dbname=isp host=localhost port=6001 user=pgsql' );
> store path (server = 3, client = 2, conninfo ='dbname=isp host=localhost port=6001 user=pgsql' );
> 
> #
> # between masters
> store listen (origin=1, receiver=2, provider=1);
> store listen (origin=2, receiver=1, provider=2);
> 
> # from master to replica
> store listen (origin=1, receiver=3, provider=1);
> store listen (origin=2, receiver=3, provider=1);
> store listen (origin=1, receiver=3, provider=2);
> store listen (origin=2, receiver=3, provider=2);
> 
> # from replica to masters
> store listen (origin=3, receiver=1, provider=3);
> store listen (origin=3, receiver=2, provider=1);
> 
> store listen (origin=3, receiver=1, provider=2);
> store listen (origin=3, receiver=2, provider=3);
> 
> create set (id=1, origin=1, comment='FOR SLAVE');
> set add table (set id = 1, origin = 1, id = 1, full qualified name = 'public.test', comment = '');
> set add sequence (set id = 1, origin = 1, id = 2, full qualified name = 'public.test_id_seq', comment = '');
> 
> subscribe set (id=1, provider=1, receiver=2, forward=yes);
> subscribe set (id=1, provider=1, receiver=3, forward=no);
> 
> 
> and then I start the slon daemons.
> 
> Everything is OK: replication works, and moving the master
> with lock set and move set works too.
> 
> But when I kill the postmaster and the slon daemon for database id 1
> and execute this script:
> 
> cluster name = test;
> 
> node 1 admin conninfo = 'dbname=isp host=localhost port=5432 user=pgsql';
> node 2 admin conninfo = 'dbname=isp host=localhost port=6000 user=pgsql';
> node 3 admin conninfo = 'dbname=isp host=localhost port=6001 user=pgsql';
> 
> ### MOVE FROM SLAVE TO MASTER
> #
> 
> failover (id=1, backup node = 2);
> drop node (id = 1, event node = 2);

Do not drop the node immediately after issuing the failover. Wait until 
all other nodes have finished processing the failover and the ACCEPT_SET 
event.
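
A rough sketch of that ordering, assuming the same preamble as in your 
script: run the failover in one slonik script, check the surviving nodes, 
and run the drop node in a second script only afterwards. The sl_set query 
in the comments is just one assumed way to confirm that the failover has 
been processed, not the only one.

```slonik
# Script 1: fail over only. Do not drop node 1 here.
cluster name = test;

node 1 admin conninfo = 'dbname=isp host=localhost port=5432 user=pgsql';
node 2 admin conninfo = 'dbname=isp host=localhost port=6000 user=pgsql';
node 3 admin conninfo = 'dbname=isp host=localhost port=6001 user=pgsql';

failover (id = 1, backup node = 2);

# Before running script 2, check in psql on nodes 2 and 3, for example:
#   SELECT set_id, set_origin FROM _test.sl_set;
# Both nodes should report set_origin = 2 for set 1.
```

```slonik
# Script 2: run separately, once nodes 2 and 3 both show the new origin.
cluster name = test;

node 1 admin conninfo = 'dbname=isp host=localhost port=5432 user=pgsql';
node 2 admin conninfo = 'dbname=isp host=localhost port=6000 user=pgsql';
node 3 admin conninfo = 'dbname=isp host=localhost port=6001 user=pgsql';

drop node (id = 1, event node = 2);
```

If node 3 still reports set_origin = 1, its slon has not finished 
processing the failover yet, and the drop node should wait.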


Jan

> 
> with output:
> 
> <stdin>:10: NOTICE:  failedNode: set 1 has other direct receivers - change providers only
> <stdin>:10: NOTICE:  failedNode: set 1 has other direct receivers - change providers only
> IMPORTANT: Last known SYNC for set 1 = 21
> 
> 
> Nothing is done, and slon daemon output looks like:
> 
> 2007-02-01 15:36:11 CET DEBUG2 syncThread: new sl_action_seq 1 - SYNC 14
> 2007-02-01 15:36:11 CET DEBUG2 remoteListenThread_2: queue event 2,15 ACCEPT_SET
> 2007-02-01 15:36:11 CET DEBUG2 remoteListenThread_2: queue event 2,16 DROP_NODE
> 2007-02-01 15:36:11 CET DEBUG2 remoteWorkerThread_2: Received event 2,15 ACCEPT_SET
> 2007-02-01 15:36:11 CET DEBUG2 start processing ACCEPT_SET
> 2007-02-01 15:36:11 CET DEBUG2 ACCEPT: set=1
> 2007-02-01 15:36:11 CET DEBUG2 ACCEPT: old origin=1
> 2007-02-01 15:36:11 CET DEBUG2 ACCEPT: new origin=2
> 2007-02-01 15:36:11 CET DEBUG2 ACCEPT: move set seq=22
> 2007-02-01 15:36:11 CET DEBUG2 got parms ACCEPT_SET
> 2007-02-01 15:36:11 CET DEBUG2 ACCEPT_SET - node not origin
> 2007-02-01 15:36:11 CET DEBUG2 ACCEPT_SET - MOVE_SET or FAILOVER_SET not received yet - sleep
> 2007-02-01 15:36:13 CET DEBUG2 remoteListenThread_2: queue event 2,17 SYNC
> 2007-02-01 15:36:17 CET DEBUG2 localListenThread: Received event 3,14 SYNC
> 2007-02-01 15:36:19 CET ERROR  slon_connectdb: PQconnectdb("dbname=isp host=localhost port=5432 user=pgsql") failed - could not connect to server: Connection refused
>          Is the server running on host "localhost" and accepting
>          TCP/IP connections on port 5432?
> 2007-02-01 15:36:19 CET WARN   remoteListenThread_1: DB connection failed - sleep 10 seconds
> 2007-02-01 15:36:19 CET DEBUG2 remoteListenThread_2: LISTEN
> 
> In the databases, in namespace _test:
> 
> on replica:
>   SELECT * from _test.sl_node
> isp-# ;
>   no_id | no_active | no_comment | no_spool
> -------+-----------+------------+----------
>       1 | t         | MASTER     | f
>       2 | t         | SLAVE      | f
>       3 | t         | REPLICA    | f
> 
> 
> on slave:
> 
> isp=# SELECT * from _test.sl_node
> isp-# ;
>   no_id | no_active | no_comment | no_spool
> -------+-----------+------------+----------
>       2 | t         | SLAVE      | f
>       3 | t         | REPLICA    | f
> (2 rows)
> 
> 
> It looks like the failover command isn't properly propagated to the replica.
> I have tested a lot of scenarios, but all of them failed...
> 
> Thanks for help.
> AK
> _______________________________________________
> Slony1-general mailing list
> [email protected]
> http://gborg.postgresql.org/mailman/listinfo/slony1-general


-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== [EMAIL PROTECTED] #