Hello list,

I currently administer a two-node heartbeat (2.0.7) + DRBD (0.7.10) 
configuration with a single shared network, like so:
Host A: 172.16.57.3
Host B: 172.16.57.4
Cluster shared IP: 172.16.57.5

I want to add a crossover cable between the two hosts to avoid split brain and 
take DRBD traffic off the public network. This doesn't seem like it should be 
an unusual configuration, but I can't find any examples anywhere. Advice on 
the best way to architect the thing would be appreciated.

Considerations:
- Under normal circumstances, intracluster communication should go over the 
crossover cable.
- Cutting the crossover cable should not break anything.
- When failover takes place, it interrupts all existing client connections. So 
if either host is disconnected from the public network, failover should /not/ 
take place (instead, traffic should be routed through the other machine over 
the crossover cable).
- Ideally, if either host is completely isolated (no MII link on either 
interface), it should shut down its services or not start them.
- Ideally, the cluster would give priority to the fully-connected host if 
services are being started for the first time.

The architecture I'm currently considering is something like the following:
Host A, interface eth0: 172.16.57.3/24 (office net)
Host A, interface eth1: 172.16.58.1/30 (crossover net)
Host A, interface lo : 172.16.59.1/32
Host B, interface eth0: 172.16.57.4/24 (office net)
Host B, interface eth1: 172.16.58.2/30 (crossover net)
Host B, interface lo : 172.16.59.2/32
Cluster shared IP is 172.16.57.5/32 on interface lo:0
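
To make the plan concrete, here is a rough sketch of the per-host addressing 
as iproute2 commands. It only prints the commands rather than running them; 
plan_addresses is a made-up helper name, and the interface names and addresses 
are the ones listed above.

```shell
#!/bin/sh
# Dry run: print the iproute2 commands for one host's addresses.
# plan_addresses is a hypothetical helper; its arguments are the office,
# crossover, and loopback addresses from the plan above.
plan_addresses() {
    echo "ip addr add $1 dev eth0"
    echo "ip addr add $2 dev eth1"
    echo "ip addr add $3 dev lo"
}

plan_addresses 172.16.57.3/24 172.16.58.1/30 172.16.59.1/32   # Host A
plan_addresses 172.16.57.4/24 172.16.58.2/30 172.16.59.2/32   # Host B
```

The shared address would be added the same way on whichever host runs the 
services, e.g. "ip addr add 172.16.57.5/32 dev lo label lo:0".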

Both hosts would run RIP on all interfaces, and both heartbeat and DRBD would 
refer to the loopback address on the 172.16.59.0 network. This neatly solves 
the routing problem. Giving the crossover connection a slightly lower RIP 
metric ensures that intracluster communications stay there in normal 
operation. The Linux kernel stack seems OK with it, but I don't know if RIP 
will get confused by the overlapping networks (172.16.57.0/24 and 
172.16.57.5/32).

The main drawback to this approach is getting traffic from the clients to the 
proper host. You can't simply let heartbeat manage the shared IP via the 
IPAddr script, since the primary host may not be the one connected to the 
network. Instead, I guess you could have a script that checks whether public 
network access is routed through the other host. In that case it could set an 
attribute on that host via cibadmin, and heartbeat would add a proxy-ARP 
entry in the opposing host's ARP table.

IOW, you run a script that looks something like this:
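my_hostname=$(uname -n)
tab=$(printf '\t')
# If the office network (172.16.57.0, shown as 003910AC in /proc/net/route)
# is reached via eth1, this host has no direct connection to it.
if grep -q "^eth1${tab}003910AC" /proc/net/route; then
   has_direct_net=false
else
   has_direct_net=true
fi
cibadmin -U -o nodes -p <<EOF
  <node uname="$my_hostname">
    <instance_attributes>
      <attributes>
        <nvpair id="${my_hostname}_has_direct_net_connection"
          name="has_direct_net_connection" value="$has_direct_net"/>
      </attributes>
    </instance_attributes>
  </node>
EOF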
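
For reference, the 003910AC in the grep pattern is 172.16.57.0 as 
/proc/net/route prints it: the address appears as a hex word in host byte 
order, so on a little-endian machine (x86) the octets come out reversed. A 
minimal sketch of the conversion, assuming a little-endian host; the helper 
name ip_to_route_hex is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: convert a dotted-quad IPv4 address to the hex form
# shown in /proc/net/route on a little-endian host (octets reversed).
ip_to_route_hex() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    printf '%02X%02X%02X%02X\n' "$4" "$3" "$2" "$1"
}

ip_to_route_hex 172.16.57.0   # office network; prints 003910AC
```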

To set up the CIB you would do something like this:
<resources>
  ...
  <primitive id="ProxyArp" class="ocf" type="ProxyArp" provider="heartbeat">
    <instance_attributes>
        <attributes>
            <nvpair id="common_ip" name="ip" value="172.16.57.5"/>
        </attributes>
     </instance_attributes>
  </primitive>
</resources>
<constraints>
  ...
  <!-- Only run where there is no direct net connection. -->
  <rsc_location id="proxy_arp_network_constraint" rsc="ProxyArp">
     <rule id="proxy_arp_network_constraint_rule" score="-INFINITY">
        <expression id="proxy_arp_network_constaint_rule_expression"
         attribute="has_direct_net_connection" operation="eq"
         value="true"/>
     </rule>
  </rsc_location>
  <!-- Only run where the services themselves are not running. -->
  <rsc_colocation id="proxy_arp_service_constraint"
     from="ProxyArp" to="service_group" score="-INFINITY"/>
</constraints>
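
The ProxyArp resource agent named in the CIB above would presumably have to be 
written. As a very rough sketch, its start and stop actions might boil down to 
the iproute2 commands below. This is a dry run that only prints them; 
proxy_arp_cmd is a hypothetical name, and eth0 as the office-net interface is 
an assumption taken from the addressing plan.

```shell
#!/bin/sh
# Dry-run sketch of a hypothetical ProxyArp agent's start/stop actions.
# It only prints the iproute2 commands; eth0 as the office-net interface
# is an assumption taken from the addressing plan.
proxy_arp_cmd() {
    action=$1 ip=$2 dev=${3:-eth0}
    case "$action" in
        start) echo "ip neigh add proxy $ip dev $dev" ;;
        stop)  echo "ip neigh del proxy $ip dev $dev" ;;
        *)     echo "usage: proxy_arp_cmd start|stop IP [DEV]" >&2; return 1 ;;
    esac
}

proxy_arp_cmd start 172.16.57.5
proxy_arp_cmd stop 172.16.57.5
```

IP forwarding (net.ipv4.ip_forward=1) would also need to be enabled on the 
host answering ARP, so that it actually passes the traffic on over the 
crossover link.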

Any and all feedback on this issue is highly appreciated.

Thanks,

--Ian Turner
