Hi list!

I've got a question about failover conditions for my two-node
Heartbeat/DRBD/NFS system.  I've already searched the list archives and
can't seem to find a definitive answer to my question.

We're using Heartbeat V1, and we're not using Stonith.  We work around
split-brain recovery by bringing up all the services in an "off" state
and manually turning everything back on.  Our failure methodology only
requires that the initial failover be automatic; the rest of the work
can be done by meatware.

The configuration seems to work fine, and we can successfully fail over
with disaster simulation or by simply shutting down heartbeat.  To
clarify: we can pull the plug on the active unit, and the secondary
takes over with no problem.

The problem is this:  we've had a couple of failure conditions where
NFS became unavailable but the server was still network-visible, so
heartbeat did not register an outage.

Here's my question:  is there a way to make Heartbeat V1 do service
tests instead of pinging to determine system health?  Or do I have to
move to V2 and the CRM?
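To make the question concrete, this is roughly the check I'd like
Heartbeat to run.  As a stopgap we could drive it from an external
watchdog on the active node (a rough sketch only; the service address,
the 3-strikes threshold, and using "/etc/init.d/heartbeat stop" as the
failover trigger are all placeholders for our setup, not anything from
the Heartbeat docs):

```shell
#!/bin/sh
# Rough sketch of an external NFS service check for a Heartbeat V1 node.
# The host, threshold, and failover action below are placeholders.

nfs_alive() {
    # Ask the portmapper on host $1 whether nfsd is registered
    # and actually answering RPC calls.
    rpcinfo -u "$1" nfs >/dev/null 2>&1
}

failover_needed() {
    # $1 = consecutive failed checks; trip after 3 in a row so a
    # single dropped RPC doesn't bounce the cluster.
    [ "$1" -ge 3 ]
}

watchdog() {
    fails=0
    while :; do
        if nfs_alive "$1"; then
            fails=0
        else
            fails=$((fails + 1))
        fi
        if failover_needed "$fails"; then
            # NFS is dead but the box still pings: stop heartbeat
            # so the peer takes over (manual cleanup afterwards,
            # same as our current split-brain procedure).
            /etc/init.d/heartbeat stop
            return 1
        fi
        sleep 5
    done
}
```

Something like "watchdog 192.168.101.60 &" from an init script would
run it on the active node, but obviously I'd rather have Heartbeat
itself do this.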

Here are some configs.  Let me know if there's more I can provide that
will help.

Thanks in advance!

Bond0 is the network serving up the NFS data.
Bond1 is the network DRBD syncs over.
.60 and .110 are node1; .61 and .111 are node2.
________________
deadtime 15
keepalive 5
warntime 6
logfacility local6

ucast bond0 192.168.101.60
ucast bond1 10.143.254.110
ucast bond0 192.168.101.61
ucast bond1 10.143.254.111

debug 1
auto_failback off
node node1.dmz.domain.local
node node2.dmz.domain.local
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
