On Tue, Mar 12, 2013 at 12:58:44PM +0000, Tim Small wrote:
> The attached patch changes the behaviour of the OpenVZ virtual machine
> cluster resource agent, so that:
>
> 1. The default resource stop timeout is greater than the hardcoded

Just for the record: where is this hardcoded actually? Is it also
documented?

> timeout in "vzctl stop" (after this time, vzctl forcibly stops the
> virtual machine) (since failure to stop a resource can lead to the
> cluster node being evicted from the cluster entirely - and this is
> generally a BAD thing).

Agreed.

> 2. The start operation now waits for resource startup to complete i.e.
> for the VE to "boot up" (so that the cluster manager can detect VEs
> which are hanging on startup, and also throttle simultaneous startups,
> so as not-to overburden the node in question). Since the start
> operation now does a lot more, the default start operation timeout has
> been increased.

I'm not sure if we can introduce this just like that. It changes
significantly the agent's behaviour.

BTW, how does vzctl know when the VE is started?

> 3. Backs off the default timeouts and intervals for various operations
> to less aggressive values.

Please make patches which are self-contained, but can be
described in a succinct manner. If the description above matches
the code modifications, then there should be three instead of one
patch.

Please continue the discussion at linux-ha-dev, that's where RA
development discussions take place.

Cheers,

Dejan

>
> Cheers,
>
> Tim.
>
>
> n.b. There is a bug in the Debian 6.0 (Squeeze) OpenVZ kernel such that
> "vzctl start <VEID> --wait" hangs. The bug doesn't impact the
> OpenVZ.org kernels (and hence won't impact Debian 7.0 Wheezy either).
>
> --
> South East Open Source Solutions Limited
> Registered in England and Wales with company number 06134732.
> Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
> VAT number: 900 6633 53 http://seoss.co.uk/ +44-(0)1273-808309
>
> --- ManageVE.old 2010-10-22 05:54:50.000000000 +0000
> +++ ManageVE 2013-03-12 11:39:47.895102380 +0000
> @@ -26,12 +26,15 @@
>  #
>  #
>  # Created 07. Sep 2006
> -# Updated 18. Sep 2006
> +# Updated 12. Mar 2013
>  #
> -# rev. 1.00.3
> +# rev. 1.00.4
>  #
>  # Changelog
>  #
> +# 12/Mar/13 1.00.4 Wait for VE startup to finish, lengthen default start timeout.
> +# Default stop timeout to longer than the vzctl stop 'polite'
> +# interval.
>  # 12/Sep/06 1.00.3 more cleanup
>  # 12/Sep/06 1.00.2 fixed some logic in start_ve
>  # general cleanup all over the place
> @@ -67,7 +70,7 @@
>  <?xml version="1.0"?>
>  <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
>  <resource-agent name="ManageVE">
> - <version>1.00.3</version>
> + <version>1.00.4</version>
>
>  <longdesc lang="en">
>  This OCF complaint resource agent manages OpenVZ VEs and thus requires
> @@ -87,12 +90,12 @@
>  </parameters>
>
>  <actions>
> - <action name="start" timeout="75" />
> - <action name="stop" timeout="75" />
> - <action name="status" depth="0" timeout="10" interval="10" />
> - <action name="monitor" depth="0" timeout="10" interval="10" />
> - <action name="validate-all" timeout="5" />
> - <action name="meta-data" timeout="5" />
> + <action name="start" timeout="240" />
> + <action name="stop" timeout="150" />
> + <action name="status" depth="0" timeout="20" interval="60" />
> + <action name="monitor" depth="0" timeout="20" interval="60" />
> + <action name="validate-all" timeout="10" />
> + <action name="meta-data" timeout="10" />
>  </actions>
>  </resource-agent>
>  END
> @@ -127,7 +130,7 @@
>  return $retcode
>  fi
>
> - $VZCTL start $VEID >& /dev/null
> + $VZCTL start $VEID --wait >& /dev/null
>  retcode=$?
>
>  if [[ $retcode != 0 && $retcode != 32 ]]; then
> _______________________________________________
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
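
For reference, the exit-status handling around the patched start call could be sketched roughly as below. This is not the agent's actual code: the function name start_ve_sketch is made up for illustration, and the two OCF return values are the standard ones (0 for success, 1 for generic error). The patch does not say what exit status 32 means, so the sketch only mirrors the check in the last hunk, which treats 0 and 32 as non-fatal.

```shell
#!/bin/sh
# Sketch only, not the ManageVE agent itself.
# Assumptions: VZCTL names the vzctl binary (as in the patch), and
# start_ve_sketch is a hypothetical helper used purely for illustration.

OCF_SUCCESS=0
OCF_ERR_GENERIC=1

: "${VZCTL:=vzctl}"

start_ve_sketch() {
    veid="$1"
    retcode=0

    # With --wait the call blocks until the VE start has completed, so the
    # cluster manager's start timeout covers the whole startup.
    "$VZCTL" start "$veid" --wait >/dev/null 2>&1 || retcode=$?

    # Same acceptance rule as the patched agent: 0 and 32 are not errors.
    case "$retcode" in
        0|32) return "$OCF_SUCCESS" ;;
        *)    return "$OCF_ERR_GENERIC" ;;
    esac
}
```

Because the call now blocks, a VE that hangs during startup (for example on a kernel affected by the --wait bug mentioned above) would only be noticed when the configured start timeout expires, so that timeout has to be chosen with the slowest expected startup in mind.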