Hi,

 

Based on your answer to my question, I need to clarify a few things.

 

Filip's answer: (slon daemons read their configuration from the cluster, not
from /etc/slony... config files; these are used only for initial
configuration, and the sysadmin has to keep them up to date manually)

 

When I was doing my failover testing, one thing I noticed is that the altperl
scripts will use slon_tools.conf if no config file is specified. In the
slon_tools.conf file there is a directive, $MASTERNODE, which defines the
Master of the cluster. Even when running slonik_show_configuration or
slonik_move_set, the config file is in fact used for referencing the cluster.
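For reference, the altperl scripts work by generating a slonik script from
slon_tools.conf. Below is a sketch of the preamble I assume they build from my
config (based on the observed behaviour, $MASTERNODE appears to supply the
default event node for the generated commands):

```
# Sketch of the slonik preamble the altperl scripts generate:
# cluster name and admin conninfo strings come from $CLUSTER_NAME
# and the add_node() entries in slon_tools.conf.
cluster name = testrepl;
node 1 admin conninfo = 'host=db01 dbname=testdb user=postgres port=5432';
node 2 admin conninfo = 'host=db02 dbname=testdb user=postgres port=5432';
```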

 

I have set up N1 as Master (ID = 1) and N2 as Slave (ID = 2), with
$MASTERNODE set to 1, which means N1 is the Master. To test my theory,
I set up two config files: one with $MASTERNODE = 1 (slon_tools.conf) and the
other with $MASTERNODE = 2 (slon_tools_failover.conf).

 

Sample of my slon_tools.conf:

---------------------------------------------

if ($ENV{"SLONYNODES"}) {
    require $ENV{"SLONYNODES"};
} else {
    $CLUSTER_NAME = 'testrepl';
    $LOGDIR = '/var/log/slony';

    # $APACHE_ROTATOR = '/usr/local/apache/bin/rotatelogs';
    # $SYNC_CHECK_INTERVAL = 1000;

    $MASTERNODE = 1;

    add_node(node => 1, host => 'db01', dbname => 'testdb', port => 5432,
             user => 'postgres', password => 'xxxxxxxx');
    add_node(node => 2, host => 'db02', dbname => 'testdb', port => 5432,
             user => 'postgres', password => 'xxxxxxxx');
}

$SLONY_SETS = {
    "set1" => {
        "set_id"       => 1,
        # "origin"     => 1,
        # foldCase     => 0,
        "table_id"     => 1,
        "sequence_id"  => 1,
        "pkeyedtables" => ["contact"],
    },
};

if ($ENV{"SLONYSET"}) {
    require $ENV{"SLONYSET"};
}

# Please do not add or change anything below this point.
1;

---------------------------------------------

 

Sample of my slon_tools_failover.conf:

---------------------------------------------

if ($ENV{"SLONYNODES"}) {
    require $ENV{"SLONYNODES"};
} else {
    $CLUSTER_NAME = 'testrepl';
    $LOGDIR = '/var/log/slony';

    # $APACHE_ROTATOR = '/usr/local/apache/bin/rotatelogs';
    # $SYNC_CHECK_INTERVAL = 1000;

    $MASTERNODE = 2;

    add_node(node => 1, host => 'db01', dbname => 'testdb', port => 5432,
             user => 'postgres', password => 'xxxxxxxx');
    add_node(node => 2, host => 'db02', dbname => 'testdb', port => 5432,
             user => 'postgres', password => 'xxxxxxxx');
}

$SLONY_SETS = {
    "set1" => {
        "set_id"       => 1,
        # "origin"     => 1,
        # foldCase     => 0,
        "table_id"     => 1,
        "sequence_id"  => 1,
        "pkeyedtables" => ["contact"],
    },
};

if ($ENV{"SLONYSET"}) {
    require $ENV{"SLONYSET"};
}

# Please do not add or change anything below this point.
1;

---------------------------------------------

 

I did a failover with the command "slonik_failover
-config=/usr/local/slony/etc/slon_tools.conf 1 2 | slonik" and it failed over
as expected. Next I tried to drop N1 from the cluster config using the following:

1. slonik_drop_node -config=/usr/local/slony/etc/slon_tools.conf 1 | slonik

   Got this error: <stdin>:4: Error: Node ID and event node cannot be identical

2. slonik_drop_node -config=/usr/local/slony/etc/slon_tools_failover.conf 1 |
slonik

   Got this message: <stdin>:10: dropped node 1 cluster

After that, I followed up by uninstalling the node on N1 from the cluster.
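My reading of the two results above (a sketch of the generated slonik,
assuming the scripts take the event node from $MASTERNODE): with
slon_tools.conf ($MASTERNODE = 1) the drop is issued with node 1 itself as the
event node, which slonik rejects; with slon_tools_failover.conf
($MASTERNODE = 2) the surviving node processes the event:

```
cluster name = testrepl;
node 1 admin conninfo = 'host=db01 dbname=testdb user=postgres port=5432';
node 2 admin conninfo = 'host=db02 dbname=testdb user=postgres port=5432';

# With $MASTERNODE = 1 the generated command is effectively:
#   drop node (id = 1, event node = 1);
# which fails: "Node ID and event node cannot be identical".

# With $MASTERNODE = 2 the event is processed by the surviving node:
drop node (id = 1, event node = 2);
```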

 

I checked in pgAdmin and found that the Slony cluster schema on N1 had been
removed, and operation was as if N1 were no longer there.

 

Next, to confirm my theory, I added the failed node back to the cluster using
the following steps:

1. Add N1 back to the cluster: slonik_store_node
-config=/usr/local/slony/etc/slon_tools_failover.conf 1 | slonik

<stdin>:7: Set up replication nodes
<stdin>:10: Next: configure paths for each node/origin
<stdin>:13: Replication nodes prepared
<stdin>:14: Please start a slon replication daemon for each node

2. Start the slon daemon on N1: slon_start
-config=/usr/local/slony/etc/slon_tools_failover.conf 1

Invoke slon for node 1 - /usr/local/slony/bin/slon -s 1000 -d2 testrepl 
'host=gitc-nix-db01 dbname=testdb user=postgres port=5432 password=xxxxxxxx' 
2>&1 > /var/log/slony/slony1/node1/testdb-2009-09-29_17:08:22.log &
Slon successfully started for cluster testrepl, node node1
PID [17087]
Start the watchdog process as well...

3. Subscribe N1 to set 1: slonik_subscribe_set
-config=/usr/local/slony/etc/slon_tools_failover.conf 1 1 | slonik

# sudo /usr/local/slony/bin/slonik_subscribe_set 
--config=/usr/local/slony/etc/slon_tools_failover.conf 1 1 | 
/usr/local/slony/bin/slonik
<stdin>:10: Subscribed nodes to set 1

4. Check that operation and replication are working.

 

Can someone verify that what I am doing is correct?

 

If yes, does that mean that the altperl scripts do not query the cluster
database for the status of the master node, and instead use slon_tools.conf
for referencing the cluster?
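One way to see what the cluster itself records, independent of any
slon_tools.conf, is to query the Slony-I catalog directly (the schema is named
_<CLUSTER_NAME>, i.e. _testrepl in my setup):

```sql
-- Ask the cluster, not the config file, which node is the origin of each set.
-- After the failover this should report node 2 as the origin of set 1.
SELECT set_id, set_origin FROM _testrepl.sl_set;
```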

 

If the answers to all my questions are correct, I would like the Slony authors
to include this in the Slony documentation so as to clarify the failover
procedures.

Regards,

Lawrence Giam

..................................................................................................
Lawrence Giam | Global IT Creations Pte Ltd |  Network Administrator  
website: http://www.globalitcreations.com
phone: +65 6836 4768 ext 115| fax: + 65 6836 4736 | mobile: + 65 9758 7448 

-----Original Message-----
From: Filip Rembialkowski [mailto:[email protected]] 
Sent: Wednesday, 30 September 2009 12:12 AM
To: Lawrence Giam
Cc: [email protected]
Subject: Re: [Slony1-general] Failover and Failback

 

 

On 29 September 2009 at 03:26, Lawrence Giam
<[email protected]> wrote:

Hi

 

Thanks for the advice.

 

With regard to the question on moving the Master role back to N1 after it is
fully synced with N2: does the move_set command pass the Master role back
completely?

yes. (with regard to this replication set; you can have more replication
sets, each with its own origin)
 

         

        Assuming the roles are switched back to the initial setup stage,
meaning N1 is the Master and N2 is the Slave, and assuming we have to stop
both servers due to an unforeseen event, then at the point of starting the
slon daemon on both servers:

        1.      is it the same as before (e.g. N1 - slon_start 1, N2 -
slon_start 2)?

yes. why should it be different?

        2.      will the Master role still be with N1?

yes.


(slon daemons read their configuration from the cluster, not from
/etc/slony... config files; these are used only for initial configuration, and
the sysadmin has to keep them up to date manually)





-- 
Filip Rembiałkowski
JID,mailto:[email protected]
http://filip.rembialkowski.net/


_______________________________________________
Slony1-general mailing list
[email protected]
http://lists.slony.info/mailman/listinfo/slony1-general
