Hi

A week ago I asked whether there was a resource agent that implements Master/Slave for a Postgres cluster using slony-1 replication.

There was none, so I tried to implement it myself.

I want to report back and explain, for future reference, why I think it is not possible (at the moment) to implement this in heartbeat.

Here we go:

Short summary of slony-1 replication:
In a slony-1 replication setup
* Tables are grouped into replication "sets"
* Each set has an "origin" (master)
* Only the origin can be written to
* There can be multiple sets with a different origin each
* There can be multiple "subscribers" (slaves) for each set
* Subscribers are read-only
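
For those not familiar with it, a minimal slonik setup for the two-machine case looks roughly like this. The cluster name, node ids, conninfo strings and the table are just made-up examples, and I left out the node/path setup, so take it as a sketch only:

#### slonik setup sketch (made-up names, node/path setup left out)
#!/bin/sh
slonik <<_EOF_
cluster name = myrepl;
node 1 admin conninfo = 'dbname=mydb host=nodeA user=slony';
node 2 admin conninfo = 'dbname=mydb host=nodeB user=slony';

# one set, originating on node 1 (the master)
create set (id = 1, origin = 1, comment = 'main set');
set add table (set id = 1, origin = 1, id = 1,
               fully qualified name = 'public.mytable');

# node 2 becomes a read-only subscriber of that set
subscribe set (id = 1, provider = 1, receiver = 2, forwarding = no);
_EOF_
####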

Since the master role has to be tied to the health of postgres itself, this restricts you to using only one set, or to managing all sets as one unit. Well, okay, I think I could live with that.

Slony-1 implements two operations, "switchover" and "failover". By switchover I mean a planned switch of roles while all machines are healthy; by failover I mean the slave taking over because the master has a problem.
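
To make that concrete, this is roughly what the two operations look like on the slonik level (again with made-up cluster/node names and conninfo strings, so just a sketch):

#### switchover vs. failover on the slonik level (sketch)
# planned switchover: both nodes have to be reachable
slonik <<_EOF_
cluster name = myrepl;
node 1 admin conninfo = 'dbname=mydb host=nodeA user=slony';
node 2 admin conninfo = 'dbname=mydb host=nodeB user=slony';
lock set (id = 1, origin = 1);
move set (id = 1, old origin = 1, new origin = 2);
_EOF_

# failover: node 1 is treated as dead, node 2 takes over the set
slonik <<_EOF_
cluster name = myrepl;
node 2 admin conninfo = 'dbname=mydb host=nodeB user=slony';
failover (id = 1, backup node = 2);
_EOF_
####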

So now comes the tricky part.
In slony-1 you cannot make an origin a subscriber without making another subscriber the new origin. Both happen in ONE command, so there are no independent "demote" and "promote" commands. In a two-machine setup you cannot have two slaves at a time.

In other words: "promote" implicitly demotes the other machine, and "demote" implicitly promotes the other machine.

So I thought I could implement "demote" as "return 0", since "promote" on the other machine will do the job anyway. Well, not the best idea, as a "monitor" action on the apparently demoted machine will still report Master status until "promote" on the second machine has finished.

Furthermore, the switchover command will fail if the other machine is not responding. If the current master really has a problem, all you can do to get a writable database on the current slave is to use the failover command. But Linux-HA only knows "promote" and "demote".
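
For completeness, this is roughly how the "monitor" action decides whether the local machine is the master - it simply asks the local database which node is the origin of the set. Cluster name, set id, node id and database name would be RA parameters; the values here are only examples:

#### monitor sketch: am I the origin of the set?
CLUSTER="myrepl"   # slony cluster name (RA parameter)
SET_ID=1           # id of the replicated set (RA parameter)
NODE_ID=1          # slony node id of the local machine (RA parameter)
DBNAME="mydb"      # database name (RA parameter)

slony_is_master() {
        # ask the local database which node is the origin of the set
        origin=$(psql -Atq -d "$DBNAME" -c \
            "SELECT set_origin FROM \"_${CLUSTER}\".sl_set WHERE set_id = ${SET_ID};")
        [ -n "$origin" ] || return 1
        [ "$origin" = "$NODE_ID" ]
}
####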

So I implemented promote and demote the following way:

#### promote
if switchover_to_me
then
        return 0
else
        # switchover failed, most likely because the other machine
        # is not responding - fall back to failover
        failover_to_me
        return $?
fi
####

#### demote
switchover_to_other_machine
# don't care whether this works, as it cannot work if
# the other machine is not healthy anyway
return 0
####

What you also need to know about slony-1 is that you have to resync the COMPLETE data after a failover. In slony-1 it is not possible to let a failed node simply rejoin the slony cluster (even if it was healthy when the failover command was issued). It has to fetch ALL data from the new master. So you want to avoid failover unless it is absolutely necessary.
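
Just to illustrate what that means in practice: as far as I understand it, after a failover the old origin has to be dropped and set up again as a completely fresh subscriber, roughly like this (made-up names again) - and the subscribe step copies every single table from scratch:

#### re-adding the failed node after a failover (sketch)
slonik <<_EOF_
cluster name = myrepl;
node 1 admin conninfo = 'dbname=mydb host=nodeA user=slony';
node 2 admin conninfo = 'dbname=mydb host=nodeB user=slony';

# slony no longer trusts the old origin, throw it away ...
drop node (id = 1, event node = 2);

# ... and re-create it as a plain subscriber of the new origin
store node (id = 1, comment = 'rejoined node', event node = 2);
store path (server = 1, client = 2, conninfo = 'dbname=mydb host=nodeA user=slony');
store path (server = 2, client = 1, conninfo = 'dbname=mydb host=nodeB user=slony');
subscribe set (id = 1, provider = 2, receiver = 1, forwarding = no);
_EOF_
####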

Up to now I thought my RA could handle a few cases, and it turns out it can handle SOME (like a master reboot, a slave reboot, or a controlled switchover). But something as simple as killing postgres on the master machine causes a failover. Why?

Say A is the master and B is the slave at this moment:

1. monitor on A fails
2. Linux-HA executes demote on A
-> As you see above, this will work even if it does nothing
3. Linux-HA executes promote on B
-> Since postgres on A is not running, this ends up in a failover (see above)

This is pretty much it. If you have any ideas on how to improve this, or if you also think this is impossible with the current master/slave implementation in Linux-HA - please respond.

The whole "separate demote and promote" approach in Linux-HA just does not seem to fit the way slony-1 handles switchover and failover.

If you have any more questions (it may well be that I forgot something), just ask - I'll be happy to help improve Linux-HA.

Best regards
Dominik
