Re: [Pacemaker] A couple of queries regarding the behaviour of ocf:heartbeat:ManageVE

2013-03-11 Thread Dejan Muhamedagic
Hi,

On Sat, Mar 09, 2013 at 05:29:11PM +, Tim Small wrote:
 Hi,
 
 It looks to me like ocf:heartbeat:ManageVE might do the wrong thing in a
 couple of places, so I thought I'd check.
 
 The resource agent manages openvz containers (i.e. lightweight virtual
 machines AKA VEs, think chroot++).
 
 
 
 The principal potential problem is the stop operation:
 
 The metadata for the resource says:
 
 <action name="stop" timeout="75" />
 
 ... however the stop action does this:
 
   $VZCTL stop $VEID >& /dev/null
   retcode=$?
 
   if [[ $retcode != 0 ]]; then
     ocf_log err "vzctl stop $VEID returned: $retcode"
     return $OCF_ERR_GENERIC
   fi
 
   return $OCF_SUCCESS
 
 
 
 When the vzctl stop command is run, effectively a "shutdown -h now"
 command gets run within the container, and the container's init
 process attempts to shut down the virtual machine.  If this hasn't
 happened within a certain amount of time (currently hard-coded within
 vzctl to be 120 seconds), e.g. because a service has hung during
 shutdown, or the system is under high load, then the vzctl command gets
 more aggressive and forcibly terminates all of the processes within the
 container.  Whether the shutdown is normal or forced, the vzctl
 return code is still 0 once the container has stopped.  The upshot of
 this is that on an unloaded system (circa 2GHz Intel core2), the vzctl
 command has a max execution time of approx 122 seconds.
 
 Given the warning at the bottom of:
 
 http://www.linux-ha.org/doc/dev-guides/_literal_stop_literal_action.html
 
 It seems to me like <action name="stop" timeout="75" /> (since it
 effectively acts as the default stop timeout in pacemaker), is a bit
 reckless given the behaviour of vzctl stop and it should probably be
 120s + some (e.g. 150s).  BTW as far as I can see vzctl stop has behaved
 like this for a while.
 
 A more flexible solution might be to make the timeout configurable, but
 in the absence of this, then I think upping the stop action timeout
 seems like the right thing to do.

The value you found is only advice to the user. You can
define a timeout for any operation on a per-resource basis (e.g.
op stop timeout=4m).
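
As a rough illustration of the per-resource override described above, a crm shell definition might look like the following. This is only a sketch: the resource name ve101 and the container ID 101 are made up for the example, and the 150s value is the "120s + some" figure suggested earlier in the thread, not a value taken from the agent.

```
crm configure primitive ve101 ocf:heartbeat:ManageVE \
    params veid=101 \
    op stop timeout=150s
```

With an explicit op stop timeout like this, the 75-second value advertised in the agent metadata no longer applies to that resource.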

 The second potential problem (the correct behaviour here is a bit less
 clear to me) is with the 'start' command.  It currently starts
 containers asynchronously (i.e. using vzctl start  instead of vzctl
 start --wait), and then returns immediately.  The monitor operation
 then immediately starts declaring the container to be started, i.e.
 it returns OCF_SUCCESS whilst the container is still starting up, as
 well as once it has fully started.
 
 This always-async behaviour effectively defeats any attempt to use

It doesn't really. All resource operations are run in background
anyway (by lrmd).

 something like batch-limit to throttle simultaneous startup of nodes,
 which can obviously be a pretty heavy-weight operation, especially
 when the VEs are relatively fat.  For instance, simultaneously starting
 up all the VEs on one of the nodes which I have in mind will make the
 load average hit >100, and could result in the node being fenced for
 being unresponsive - this would then make the same thing happen on the
 other node - splat.
 
 I'm thinking that ocf:heartbeat:ManageVE should probably default to the
 vzctl start --wait case.  I'm not sure if the timeout for start should
 be raised tho' (maybe it should even be lowered as
 http://www.linux-ha.org/doc/dev-guides/_metadata.html states "This is a
 hint to the user what minimal timeout should be configured for the
 action.").

Yes, that's also what I wanted to say a bit earlier.

The start operation should in any case wait until the resource is
completely started, so it should do 'start --wait'.
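
A minimal sketch of what a synchronous start could look like, modelled on the stop code quoted above. This is not the agent's actual code: the function name start_ve, the default vzctl path, and the use of echo instead of ocf_log are assumptions made to keep the example self-contained. The exit code values are the standard OCF ones.

```shell
#!/bin/sh
# Standard OCF exit codes, defined locally so the sketch is self-contained.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

# Assumed default location of vzctl; override VZCTL to suit the system.
: "${VZCTL:=/usr/sbin/vzctl}"

# start_ve: start container $VEID and block until vzctl reports that
# startup has finished (--wait), instead of returning immediately.
start_ve() {
    $VZCTL start "$VEID" --wait >/dev/null 2>&1
    retcode=$?

    if [ "$retcode" -ne 0 ]; then
        # A real agent would call ocf_log err here.
        echo "vzctl start $VEID --wait returned: $retcode" >&2
        return $OCF_ERR_GENERIC
    fi

    return $OCF_SUCCESS
}
```

Because the call now blocks until the container has booted, the start timeout configured for the resource has to cover the whole boot time of the container.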

Thanks,

Dejan

 Any feedback welcome!
 
 Tim.
 
 -- 
 South East Open Source Solutions Limited
 Registered in England and Wales with company number 06134732.  
 Registered Office: 2 Powell Gardens, Redhill, Surrey, RH1 1TQ
 VAT number: 900 6633 53  http://seoss.co.uk/ +44-(0)1273-808309
 
 
 ___
 Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
 http://oss.clusterlabs.org/mailman/listinfo/pacemaker
 
 Project Home: http://www.clusterlabs.org
 Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
 Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] A couple of queries regarding the behaviour of ocf:heartbeat:ManageVE

2013-03-11 Thread Tim Small
On 11/03/13 15:12, Dejan Muhamedagic wrote:
 A more flexible solution might be to make the timeout configurable, but
  in the absence of this, then I think upping the stop action timeout
  seems like the right thing to do.
 
 The value you found is only advice to the user. You can
 define a timeout for any operation on a per-resource basis (e.g.
 op stop timeout=4m)

Sorry - I didn't make myself clear - what I meant is that the vzctl
program (which ocf:heartbeat:ManageVE uses extensively), should be
modified so that its internal timeout (after which it forcibly stops the
VM in question - currently hard-coded to 120s) is modifiable.  The
ocf:heartbeat:ManageVE could then support setting this timeout (and
complain loudly if it has been set too close to, or greater than, the
resource stop operation timeout value).

As it is, I think the ocf:heartbeat:ManageVE stop timeout should always
be greater than the underlying vzctl timeout, otherwise the virtual
machine (VE / container / zone, whatever you want to call it) stop
operation will be unreliable, and the cluster node runs the risk of
going splat.
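
The "complain loudly" idea can be sketched even with the current hard-coded vzctl behaviour. The sketch below assumes the 120-second forced-stop delay reported in this thread. The function name check_stop_timeout and the 30-second margin are hypothetical, and echo stands in for ocf_log.

```shell
#!/bin/sh
# Assumption from this thread: vzctl forcibly stops the container after 120 s.
VZCTL_STOP_FORCE_SECS=120
# Hypothetical safety margin on top of vzctl's own timeout.
STOP_MARGIN_SECS=30

# check_stop_timeout TIMEOUT_MS
# Returns 0 if the stop timeout (in milliseconds) leaves room for vzctl's
# forced stop plus the margin, and 1 (with a warning on stderr) otherwise.
check_stop_timeout() {
    timeout_ms=$1
    needed_ms=$(( (VZCTL_STOP_FORCE_SECS + STOP_MARGIN_SECS) * 1000 ))

    if [ "$timeout_ms" -lt "$needed_ms" ]; then
        echo "WARNING: stop timeout ${timeout_ms}ms is below ${needed_ms}ms;" \
             "vzctl stop may still be running when the operation times out" >&2
        return 1
    fi
    return 0
}
```

Inside a resource agent this could be called as check_stop_timeout "$OCF_RESKEY_CRM_meta_timeout", on the assumption that Pacemaker exports the operation timeout in milliseconds under that name.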

I realise that the ocf:heartbeat:ManageVE stop operation timeout is
advice to the user, but it seems like pretty bad advice at the moment!

 The start operation should anyway wait until the resource is
 completely started, so it should do 'start --wait.'
   

OK, good - that's what I thought.  I'll submit a patch for the resource
script for the time being, and look at doing the modification to vzctl
asynchronously, I think...

Tim.