Hi,

It looks to me like ocf:heartbeat:ManageVE might do the wrong thing in a couple of places, so I thought I'd check.
The resource agent manages OpenVZ containers (i.e. lightweight virtual machines AKA "VEs", think chroot++).

The principal potential problem is the stop operation. The metadata for the resource says:

    <action name="stop" timeout="75" />

... however the stop action does this:

    $VZCTL stop $VEID >& /dev/null
    retcode=$?
    if [[ $retcode != 0 ]]; then
      ocf_log err "vzctl stop $VEID returned: $retcode"
      return $OCF_ERR_GENERIC
    fi
    return $OCF_SUCCESS

When "vzctl stop" is run, effectively a "shutdown -h now" command gets run within the container, and the container's init process attempts to shut down the virtual machine. If this hasn't happened within a certain amount of time (currently hard-coded within vzctl to be 120 seconds), e.g. because a service has hung during shutdown, or the system is under high load, then the vzctl command gets more aggressive and forcibly terminates all of the processes within the container. Whether the shutdown is "normal" or "forced", the vzctl return code is still 0 once the container has stopped.

The upshot of this is that on an unloaded system (circa 2GHz Intel core2), the vzctl command has a max execution time of approx 122 seconds.

Given the warning at the bottom of:

http://www.linux-ha.org/doc/dev-guides/_literal_stop_literal_action.html

it seems to me that <action name="stop" timeout="75" /> (since it effectively acts as the default stop timeout in pacemaker) is a bit reckless given the behaviour of "vzctl stop", and it should probably be 120s + some (e.g. 150s). BTW as far as I can see, vzctl stop has behaved like this for a while.

A more flexible solution might be to make the timeout configurable, but in the absence of this, I think upping the stop action timeout seems like the right thing to do.

The second potential problem (the correct behaviour here is a bit less clear to me) is with the 'start' command. It currently starts containers asynchronously (i.e. using "vzctl start" instead of "vzctl start --wait"), and then returns immediately. The monitor operation then immediately starts declaring the container to be started, i.e. it returns OCF_SUCCESS whilst the container is still starting up, as well as once it has fully started.

This always-async behaviour effectively defeats any attempt to use something like batch-limit to throttle simultaneous startup of VEs, which can obviously be pretty heavy-weight operations, especially when the VEs are relatively fat. For instance, simultaneously starting up all the VEs on one of the nodes which I have in mind will make the load average hit > 100, and could result in the node being fenced for being unresponsive - this would then make the same thing happen on the other node - splat.

I'm thinking that ocf:heartbeat:ManageVE should probably default to the "vzctl start --wait" case. I'm not sure if the timeout for start should be raised though (maybe it should even be lowered, as http://www.linux-ha.org/doc/dev-guides/_metadata.html states "This is a hint to the user what minimal timeout should be configured for the action.").

Any feedback welcome!

Tim.

-- 
South East Open Source Solutions Limited
Registered in England and Wales with company number 06134732.
Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
VAT number: 900 6633 53
http://seoss.co.uk/
+44-(0)1273-808309
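
P.S. To make the two suggestions above a bit more concrete, here is a rough sketch of what I have in mind. It is not the shipped ManageVE code: the helper names (min_stop_timeout, ve_start_wait), the variable names and the 30 second margin are all made up for illustration, and the fallback definitions at the top only exist so that the snippet can be run outside the agent (in the real agent they would come from ocf-shellfuncs).

```shell
#!/bin/sh
# Sketch only, under the assumptions stated above.

# Fallbacks so the sketch runs standalone; the real agent gets these
# from ocf-shellfuncs.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_GENERIC:=1}"
: "${VZCTL:=vzctl}"
type ocf_log >/dev/null 2>&1 || ocf_log() { shift; echo "$*" >&2; }

# 120s is the forced-kill delay observed in vzctl (see the mail above);
# the 30s margin is an arbitrary assumption.
VZ_FORCED_STOP_DELAY=120
VZ_STOP_MARGIN=30

# Hypothetical helper: the smallest stop timeout worth advertising
# in the agent metadata.
min_stop_timeout() {
    echo $(( VZ_FORCED_STOP_DELAY + VZ_STOP_MARGIN ))
}

# Hypothetical helper: start a VE synchronously, i.e. do not return
# until "vzctl start --wait" itself has returned.
ve_start_wait() {
    veid=$1
    "$VZCTL" start "$veid" --wait >/dev/null 2>&1
    retcode=$?
    if [ "$retcode" -ne 0 ]; then
        ocf_log err "vzctl start $veid --wait returned: $retcode"
        return "$OCF_ERR_GENERIC"
    fi
    return "$OCF_SUCCESS"
}
```

With the numbers above, min_stop_timeout prints 150, which is the sort of value I would put in the stop action metadata. Because ve_start_wait blocks until vzctl returns, the start operation would stay "in flight" for the whole boot, which is what batch-limit needs in order to throttle anything.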