On Mon, Nov 30, 2009 at 9:19 PM, Frank DiMeo <frank.di...@bigbandnet.com> wrote:
> I’m experimenting with startup sequence and co-location control, and think I
> may have stumbled across a bug.
>
> I have two xml files that I use in my testing as my initial configuration of
> a two node cluster. I start each node with no configuration, and then use
> cibadmin to “source in” the xml file. Each file defines two resources as
> well as a startup order and collocation definition. The only difference
> between the two files is the syntax I use to specify the startup order.
>
> When I use the syntax:
>
> <rsc_order id="order-1" first="world1" then="world2" score="INFINITY" />
>
> Everything works fine. I can put either of the two nodes into standby while
> resources are running there, and the resources move to the other node as
> expected.
>
> However, when I use the syntax:
>
> <rsc_order id="order-1">

You're missing a score. Without one it defaults to 0 (which means optional).
However, IIRC, the 1.0.6 schema won't allow you to set a score there, so
you'll need to apply the following patch:
  http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/c8585629629c

>   <resource_set id="order-1-set-1" sequential="true">
>     <resource_ref id="world1" />
>     <resource_ref id="world2" />
>   </resource_set>
> </rsc_order>
>
> Several bad things happen. First, the resources don’t move off the node
> that is put into standby, even though the alternate node is running and able
> to run the resources.

Did you remove the other ordering constraint first?

> Second, attempting to shut down openais on the node
> running the resources after attempting a forced move (by putting the node
> into standby) leaves both the lrmd and pengine processes running (but as
> children of process 1 (init)), and the resources continue to run on that
> node even after openais is stopped.

I suspect you've a faulty init script there. See other email.

> I turned debug on in crmd and in the logs and recorded what happens when I
> force standby, and I notice that using the first syntax causes
> te_rsc_command to be executed to send a shut down message to the node where
> the resources are running (which seems to work), while using the second
> syntax causes te_pseudo_action to be called in approximately the same place
> in the log, but no shutdown of resources happens (I can’t really tell what
> this is supposed to be doing).

Neither can I - you didn't attach the logs :-)

_______________________________________________
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
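
For illustration, the set-based constraint from the question might look like this with an explicit score. This is only a sketch: putting score="INFINITY" on the rsc_order element is an assumption about where the score belongs, and per the reply above the 1.0.6 schema may reject it until the linked patch is applied. The ids are the ones used in the original question.

```xml
<!-- Sketch only: the score attribute on a set-based rsc_order is an
     assumption; the unpatched 1.0.6 schema may refuse it. -->
<rsc_order id="order-1" score="INFINITY">
  <!-- sequential="true": members are ordered in the sequence listed -->
  <resource_set id="order-1-set-1" sequential="true">
    <resource_ref id="world1" />
    <resource_ref id="world2" />
  </resource_set>
</rsc_order>
```

If the score is left off, the ordering defaults to 0 and is optional, which would match the behaviour described in the question. Whichever form is used, only one constraint with id "order-1" should be present in the configuration at a time.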