Hi list!

I've got a question about failover conditions for my two-node
Heartbeat/DRBD/NFS system.  I've already searched the list archives and
can't seem to find a definitive answer to my question.

We're using Heartbeat V1, and we're not using STONITH.  We work around
split-brain recovery by bringing up all the services in an "off" state and
manually turning everything back on.  Our failure model only requires that
the initial failover be automatic; the rest of the recovery work can be
done by meatware.

The configuration seems to work fine, and we can successfully fail over in
a disaster simulation or by simply shutting down heartbeat.  To clarify: we
can pull the plug on the active unit, and the secondary takes over with no
problem.

The problem is this: we've had a couple of failure conditions where NFS
became unavailable while the server was still network-visible, so heartbeat
did not register an outage.

Here's my question: is there a way to make Heartbeat V1 do service-level
tests instead of pinging to determine system health?  Or do I have to go to
V2 and the CRM?
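
For context, the workaround I've been sketching is a cron-driven watchdog
that probes NFS locally on the active node and stops heartbeat when the
probe fails, forcing the secondary to take over.  This is only a rough
sketch, not something we're running; the rpcinfo probe and the init-script
path are my assumptions about a typical setup.

#!/bin/sh
# Sketch of an NFS watchdog -- run from cron on the active node,
# e.g. every minute.  Probe method and heartbeat path are assumptions.

# Null-call the local NFS service over UDP via the portmapper.
if ! rpcinfo -u localhost nfs >/dev/null 2>&1; then
    logger -p local6.err "NFS probe failed; stopping heartbeat to force failover"
    /etc/init.d/heartbeat stop
fi

I'd rather heartbeat itself were doing this kind of check, hence the
question about going to V2 and the CRM.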

Here's my ha.cf.  Let me know if there's more I can provide that will
help.

Thanks in advance!

bond0 is the network serving up the NFS data.
bond1 is the network DRBD syncs over.
.60 and .110 are node1; .61 and .111 are node2.
________________
# Declare the peer dead after 15 seconds of silence; heartbeats go
# out every 5 seconds, with a "late heartbeat" warning at 6.
deadtime 15
keepalive 5
warntime 6
logfacility local6

# Unicast heartbeat to both nodes' addresses on the service network
# (bond0) and the DRBD replication network (bond1).
ucast bond0 192.168.101.60
ucast bond1 10.143.254.110
ucast bond0 192.168.101.61
ucast bond1 10.143.254.111

debug 1
# Resources stay on the survivor; no automatic failback when the
# failed node returns.
auto_failback off
node node1.dmz.domain.local
node node2.dmz.domain.local
