Hi,

On Sun, Sep 30, 2007 at 05:37:10PM -0700, Kelly Byrd wrote:
> I'm running 2.1.2 on two nodes. I want heartbeat to manage 22 VMware VMs
> across the two nodes. In terms of heartbeat resources, each VM is:
> - drbd ocf master-slave resource
> - Filesystem ocf resource (XFS)
> - VM ocf resource (my own ocf script)
>
> I'm looking for advice on how to group these resources since they all
> depend on each other. I'm testing with a config very similar to the
> DRBD/HowTov2 example at: http://www.linux-ha.org/DRBD/HowTov2.
>
> I have a drbd master/slave resource (ms-drbd1), and then a group
> (group_vm1). The group contains a filesystem resource (vm1-fs) and my VM
> (vm1-vm) resource. I have an rsc_order constraint saying group_vm1 should
> only run where ms-drbd1 has been promoted. I also have a rsc_colocation
> constraint saying group_vm1 follows ms-drbd1.

You mean the other way around: exchange rsc_order and rsc_colocation.
The colocation constraint says where group_vm1 runs (on the node where
ms-drbd1 has been promoted); the order constraint says when it starts
(after the promote).

> Finally I have a location
> constraint saying ms-drbd1 prefers node1.
>
> When testing this with two VMs (add ms-drbd2 and group_vm2, preferring
> node2), things don't always work out as planned. Sometimes, with only one
> node running, if I "/etc/init.d/heartbeat start" on node1, ms-drbd1 and
> group_vm1 will try to migrate over to node1, fail then return back to
> node2. It's not clear to me what's failing.

What do the logs say?

> I feel like sometimes I end up
> in a state where the drbd resource starts, but the filesystem doesn't and
> therefore the VM resource doesn't. Maybe I need a delay between resource
> starts?

You can insert a Delay resource between the two. However, a delay
should not be needed.

> Should I be grouping these differently?

The config looks ok to me.

> I'm going to be creating
> 22 of these "group of three" resources, with three constraints for each.
> Is there an easier set of XML to configure this? I want half to prefer one
> node, and half to prefer the other. Finally, if both nodes are up and
> group_vm1 failed to start on a node, will it retry later? Actually, that's
> more important to me in the single node case as there is no other place
> for the failed resource to live.

This has been planned. Probably in the next release heartbeat will
forget about a failed start after a while. For now, you will have to
clean up the failed resource with crm_resource -C.

Thanks,

Dejan

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
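
On the question of an easier way to write the XML for 22 VMs: one option is to generate the three constraints per VM with a small script instead of writing them by hand. The following is only a sketch, not a tested heartbeat configuration. The attribute names (from, to, action, to_action, to_role, role, score) are assumed to follow the pattern used in the DRBD/HowTov2 example and should be checked against the DTD of your heartbeat version. The constraint ids, the score of 100, and the function name build_constraints are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_constraints(vm_count, nodes):
    """Build a <constraints> element holding three constraints per VM.

    Resource names follow the thread (ms-drbdN, group_vmN). All ids and
    the location score are made up for illustration.
    """
    constraints = ET.Element("constraints")
    for i in range(1, vm_count + 1):
        drbd = f"ms-drbd{i}"
        group = f"group_vm{i}"
        # Alternate the preferred node so the VMs are split evenly.
        preferred = nodes[(i - 1) % len(nodes)]

        # Order: start the group only after the drbd resource is promoted.
        ET.SubElement(constraints, "rsc_order", {
            "id": f"order_vm{i}", "from": group, "action": "start",
            "to": drbd, "to_action": "promote"})

        # Colocation: run the group where the drbd resource is master.
        ET.SubElement(constraints, "rsc_colocation", {
            "id": f"colo_vm{i}", "from": group, "to": drbd,
            "to_role": "master", "score": "INFINITY"})

        # Location: the drbd master prefers one node (assumed score).
        loc = ET.SubElement(constraints, "rsc_location", {
            "id": f"loc_vm{i}", "rsc": drbd})
        rule = ET.SubElement(loc, "rule", {
            "id": f"loc_vm{i}_rule", "role": "master", "score": "100"})
        ET.SubElement(rule, "expression", {
            "id": f"loc_vm{i}_expr", "attribute": "#uname",
            "operation": "eq", "value": preferred})
    return constraints


if __name__ == "__main__":
    root = build_constraints(22, ["node1", "node2"])
    # Print one constraint per line so the output is easy to review.
    for element in root:
        print(ET.tostring(element, encoding="unicode"))
```

With the node list ["node1", "node2"], odd-numbered VMs prefer node1 and even-numbered VMs prefer node2, which gives the half-and-half split asked about. The output would still need to be reviewed and loaded into the CIB with the usual tools.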
