Hello list,
I currently administer a two-node heartbeat (2.0.7) + DRBD (0.7.10)
configuration with a single shared network, like so:
Host A: 172.16.57.3
Host B: 172.16.57.4
Cluster shared IP: 172.16.57.5
I want to add a crossover cable between the two hosts to avoid split brain and
take DRBD traffic off the public network. This doesn't seem like it should be
an unusual configuration, but I can't find any examples anywhere. Advice on
the best way to architect the thing would be appreciated.
Considerations:
- Under normal circumstances, intracluster communication should go over the
crossover cable.
- Cutting the crossover cable should not break anything.
- When failover takes place, it interrupts all existing client connections. So
if either host is disconnected from the public network, failover should /not/
take place (instead, traffic should be routed through the other machine over
the crossover cable).
- Ideally, if either host is completely isolated (no MII link on either
interface), it should shut down (or decline to start) services.
- Ideally, the cluster would give priority to the fully-connected host if
services are being started for the first time.
The architecture I'm currently considering is something like the following:
Host A, interface eth0: 172.16.57.3/24 (office net)
Host A, interface eth1: 172.16.58.1/30 (crossover net)
Host A, interface lo : 172.16.59.1/32
Host B, interface eth0: 172.16.57.4/24 (office net)
Host B, interface eth1: 172.16.58.2/30 (crossover net)
Host B, interface lo : 172.16.59.2/32
Cluster shared IP is 172.16.57.5/32 on interface lo:0
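For what it's worth, the heartbeat media for this layout seem straightforward; a
sketch of ha.cf on Host A might look like the following (hostnames are
placeholders, and Host B would use the mirrored addresses). Running heartbeat
over both links means cutting the crossover alone can't cause split brain:

```text
# /etc/ha.d/ha.cf on Host A (sketch; hostnames are placeholders)
use_logd yes
crm yes                   # CRM mode, needed for the cibadmin bits below
keepalive 1
deadtime 10
ucast eth1 172.16.58.2    # preferred heartbeat path: the crossover
ucast eth0 172.16.57.4    # second path over the office net
node host-a host-b
```

In my actual design heartbeat would talk to the peer's 172.16.59.x loopback
address instead, but I'm not sure heartbeat's ucast directive copes with a
routed (non-link-local) peer, hence the per-link addresses here.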
Both hosts would run RIP on all interfaces, and both heartbeat and DRBD would
refer to the loopback address on the 172.16.59.0 network. This neatly solves
the routing problem. Giving the crossover connection a slightly lower RIP
metric ensures that intracluster communications stay there in normal
operation. The Linux kernel stack seems OK with it, but I don't know if RIP
will get confused by the overlapping networks (172.16.57.0/24 and
172.16.57.5/32).
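If I went with Quagga's ripd (routed should also work), I imagine the metric
arrangement would look something like this on Host A; the offset-list is the
part that penalizes the office-net path so the crossover wins while it's up:

```text
! /etc/quagga/ripd.conf on Host A (sketch)
router rip
 network eth0
 network eth1
 ! advertise the 172.16.59.1/32 loopback address
 network lo
 ! penalize routes learned over the office net so the crossover wins
 offset-list ALL in 2 eth0
!
access-list ALL permit any
```

Whether ripd itself copes with the /32 overlapping the /24 is exactly the part
I don't know; the kernel at least prefers the more specific route either way.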
The main drawback to this approach is getting traffic from the clients to the
proper host. You can't simply let heartbeat manage the shared IP via the
IPaddr script, since the primary host may not be the one connected to the
network. Instead, I guess you could have a script that checks whether public
network access is routed through the other host. If it is, it could set an
attribute on that host via cibadmin, and heartbeat would add a proxy-ARP
entry in the opposing host's ARP table.
IOW, you run a script that looks something like this:
# 003910AC is 172.16.57.0 in the little-endian hex used by /proc/net/route;
# if the office net is reached via eth1, we have lost the direct connection.
if grep -q '^eth1[[:space:]]003910AC' /proc/net/route; then
    has_direct_net=false
else
    has_direct_net=true
fi
cibadmin -U -o nodes -p <<EOF
<node uname="$my_hostname">
<instance_attributes>
<attributes>
<nvpair id="${my_hostname}_has_direct_net_connection"
name="has_direct_net_connection" value="$has_direct_net"/>
</attributes>
</instance_attributes>
</node>
EOF
To set up the CIB you would do something like this:
<resources>
...
<primitive id="ProxyArp" class="ocf" type="ProxyArp" provider="heartbeat">
<instance_attributes>
<attributes>
<nvpair id="common_ip" name="ip" value="172.16.57.5"/>
</attributes>
</instance_attributes>
</primitive>
</resources>
<constraints>
...
<!-- Only run where there is no direct net connection. -->
<rsc_location id="proxy_arp_network_constraint" rsc="ProxyArp">
<rule id="proxy_arp_network_constraint_rule" score="-INFINITY">
<expression id="proxy_arp_network_constraint_rule_expression"
attribute="has_direct_net_connection" operation="eq"
value="true"/>
</rule>
</rsc_location>
<!-- Only run where the services themselves are not running. -->
<rsc_colocation id="proxy_arp_service_constraint"
from="ProxyArp" to="service_group" score="-INFINITY"/>
</constraints>
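As far as I know there is no stock ProxyArp agent, so it would have to be
written. A minimal sketch of its start/stop actions (the agent name and
argument convention are my invention; "ip neigh ... proxy" is iproute2
syntax, and IP forwarding must also be enabled for the forwarded traffic
itself):

```shell
# Sketch of the start/stop actions a hypothetical ProxyArp OCF agent
# might perform for the shared IP.

proxy_arp_start() {
    # Answer ARP on the public interface ($2) for the shared IP ($1),
    # so clients keep reaching it while packets ride the crossover.
    ip neigh add proxy "$1" dev "$2"
}

proxy_arp_stop() {
    ip neigh del proxy "$1" dev "$2"
}
```

Invoked as, e.g., `proxy_arp_start 172.16.57.5 eth0` from the agent's start
action.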
Any and all feedback on this issue is highly appreciated.
Thanks,
--Ian Turner
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems