Re: [Linux-HA] 3 node cluster keeps failing after domU image is started

Greg Woods Sun, 27 Jun 2010 07:34:52 -0700

On Sun, 2010-06-27 at 03:02 -0700, Joe Shang wrote:

> Failed actions:
>     drbd_xen2:1_start_0 (node=xen1.box.com, call=10, rc=5,
> status=complete): not installed


This is one of the things that I don't like about heartbeat/pacemaker. A
minor error (misconfiguring a single resource) can cause major problems
(like a stonith death match that brings down the entire cluster).

One thing I have seen with Xen VMs is that the default timeouts are too
short. That may not be your particular problem, but you probably need to
increase them anyway. This is an example of what I have:



primitive VM-ldap ocf:heartbeat:Xen \
        params xmfile="/etc/xen/ldap" \
        op monitor interval="10" timeout="120" depth="0" target-role="Stopped" \
        op start interval="0" timeout="60s" \
        op stop interval="0" timeout="120s" \
        meta is-managed="true" target-role="Started"

Before I added the explicit "op start" and "op stop" timeouts, I would
get failed stop or start operations and any attempt to fail over would
start a death match.
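
A minimal sketch of a related safety net, assuming your crm shell
supports the op_defaults section: a cluster-wide default operation
timeout covers any resource whose operations have no explicit value.
The 120s figure is only an illustration taken from the stop timeout
above, not a tested recommendation.

op_defaults timeout="120s"

Explicit per-operation timeouts like the ones in the primitive above
still take precedence over this default.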

--Greg


