On Tue, Jun 21, 2011 at 07:37:22AM +0200, Kulovits Christian - OS ITSC wrote:
> Hi Dejan,
> We run Sybase at our shop, and starting the Sybase server can take anywhere
> from 5 to 45 minutes. I found a resource agent on the web that needs
> 3 timeout parameters passed to it: one for start, one for stop and one for
> monitor.

I guess that you know you have to make sure that the resource
agent is correctly implemented. There's also ocf-tester to help
with testing.
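
A test run could look roughly like the sketch below. The agent path, the
resource name and the sybase_home parameter are made up for illustration;
substitute whatever your agent actually expects.

```shell
#!/bin/sh
# Sketch: run ocf-tester against a resource agent, if the tool is installed.
# The agent path, the test resource name and the "sybase_home" parameter
# are hypothetical placeholders, not taken from a real Sybase agent.
RA=${RA:-/usr/lib/ocf/resource.d/custom/sybase}

run_ocf_tester() {
    # Bail out cleanly when ocf-tester is not available on this host.
    if ! command -v ocf-tester >/dev/null 2>&1; then
        echo "ocf-tester not found in PATH" >&2
        return 127
    fi
    # -n names the test resource, -o passes one agent parameter.
    ocf-tester -n res_sybase_test -o sybase_home=/opt/sybase "$RA"
}
```

Calling run_ocf_tester on a test node (not on a node where the real
resource is active) then exercises the agent's actions in turn.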

> The cluster configuration itself already has similar timeout values for the
> start, stop and monitor operations, set in the metadata of the defined
> resource primitive.
> Back to the Sybase server: I tried to change this RA so that it drops the
> redundant timeout parameters, lets the start run until the resource's
> start timeout has elapsed, then sets the resource itself to unmanaged with
>     crm_resource --meta -t primitive -r $OCF_RESOURCE_INSTANCE -p is-managed -v false
> and returns rc=0 so that the still-starting Sybase keeps running. But the
> part of the code that runs after the SIGTERM has only 5 seconds to live.
> 
> The reason for doing this is that once the Sybase startup has timed out, the
> cluster itself will stop the Sybase resource. That terminates the startup
> process, and we have to run the long-lasting startup all over again. Another
> way would be to have the metadata of the resource primitive passed to the
> resource agent, but so far I have found no way to get at it.
> Yet another way is to set the timeout to a very, very high value, but I do
> not think this is a very good idea.

Why not? That's actually the only thing you can do. Note that a
shorter timeout helps only if the resource may hang.
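
As for getting the operation timeout into the agent: the lrmd log quoted
further down lists CRM_meta_timeout among the operation parameters, and
such meta attributes reach the agent's environment with an OCF_RESKEY_
prefix, in milliseconds. A minimal sketch of how an agent might use that
instead of separate timeout parameters follows; the function name and the
10-second safety margin are my own illustrative choices, not part of any
existing agent.

```shell
#!/bin/sh
# Sketch: turn the operation timeout (milliseconds, exported by the
# cluster as OCF_RESKEY_CRM_meta_timeout) into a deadline in seconds
# that the agent's own wait loop can use.
# op_deadline_secs and the default 10-second margin are illustrative.
op_deadline_secs() {
    margin=${1:-10}
    # Fall back to 20000 ms (the 20s default mentioned in this thread)
    # when the variable is not set, e.g. when the agent is run by hand.
    timeout_ms=${OCF_RESKEY_CRM_meta_timeout:-20000}
    secs=$(( timeout_ms / 1000 - margin ))
    # Never report a deadline shorter than one second.
    if [ "$secs" -lt 1 ]; then
        secs=1
    fi
    echo "$secs"
}
```

The start action could then poll the server for at most
$(op_deadline_secs) seconds and fail by itself before lrmd has to send
SIGTERM. With, say, "op start timeout=3000s" on the primitive, that leaves
Sybase just under 50 minutes to come up.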

Thanks,

Dejan

> 
> Regards, Christian
> 
> -----Original Message-----
> From: Dejan Muhamedagic [mailto:deja...@fastmail.fm] 
> Sent: Monday, 20 June 2011 16:18
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Resource Agent timeout
> 
> Hi,
> 
> On Mon, Jun 20, 2011 at 03:15:23PM +0200, Kulovits Christian - OS ITSC wrote:
> > Andreas,
> > you mean the cluster-wide default timeout? What I am asking is whether the
> > fixed 5-second delay between the SIGTERM, sent when the resource timeout is
> > exceeded, and the following SIGKILL can be changed.
> 
> No, it's not configurable. 
> 
> What's the use case? You should prevent actions from timing out in the first 
> place.
> 
> Thanks,
> 
> Dejan
> 
> > Regards,
> > Christian
> > 
> > -----Original Message-----
> > From: Andreas Kurz [mailto:andreas.k...@linbit.com]
> > Sent: Monday, 20 June 2011 15:08
> > To: pacemaker@oss.clusterlabs.org
> > Subject: Re: [Pacemaker] Resource Agent timeout
> > 
> > On 2011-06-20 14:28, Kulovits Christian - OS ITSC wrote:
> > > Hello List,
> > > 
> > >  
> > > 
> > > When a resource agent times out a SIGTERM is issued when the timeout 
> > > value has exceeded. When the resource agent will not terminate 
> > > within the next  5 seconds a SIGKILL is issued. Is there a way to 
> > > set this limit? May be to 30 secs or so? 5 seconds may often be 
> > > insufficient for a proper cleanup.
> > > 
> > 
> > The default action timeout is 20s so you already "tuned" it ... you can set 
> > a global "default-action-timeout" or specify a timeout for each operation 
> > per resource.
> > 
> > Regards,
> > Andreas
> > 
> > >  
> > > 
> > >  
> > > 
> > > Jun 20 10:51:04 mars lrmd: [2178]: info: RA output:
> > > (res_TimeoutRA_Killroy:stop:stderr) + sleep 10
> > > 
> > > Jun 20 10:51:08 mars lrmd: [2178]: WARN: res_TimeoutRA_Killroy:stop 
> > > process (PID 24359) timed out (try 1).  Killing with signal SIGTERM (15).
> > > 
> > > Jun 20 10:51:08 mars lrmd: [2178]: info: RA output:
> > > (res_TimeoutRA_Killroy:stop:stderr) Terminated
> > > 
> > > Jun 20 10:51:08 mars lrmd: [2178]: info: RA output:
> > > (res_TimeoutRA_Killroy:stop:stderr) ++ ha_debug 'DEBUG: Resource
> > > (res_TimeoutRA_Killroy): Timeout during stop of res_TimeoutRA_Killroy'
> > > 
> > > ++ sleep 10
> > > 
> > >  
> > > 
> > > Jun 20 10:51:13 mars lrmd: [2178]: WARN: res_TimeoutRA_Killroy:stop 
> > > process (PID 24359) timed out (try 2).  Killing with signal SIGKILL (9).
> > > 
> > > Jun 20 10:51:13 mars lrmd: [2178]: WARN: operation stop[94] on 
> > > ocf::TimeoutRA::res_TimeoutRA_Killroy for client 2181, its parameters:
> > > CRM_meta_timeout=[5000] crm_feature_set=[3.0.1] CRM_meta_name=[start] :
> > > pid [24359] timed out
> > > 
> > > Jun 20 10:51:13 mars crmd: [2181]: ERROR: process_lrm_event: LRM 
> > > operation res_TimeoutRA_Killroy_stop_0 (94) Timed Out 
> > > (timeout=5000ms)
> > > 
> > > > With best regards, Christian Kulovits
> > > 
> > > > ____________________________________________
> > > > 
> > > > AUSTRIAN AIRLINES
> > > > Christian Kulovits
> > > > ITSC Central System & Database Services
> > > > Senior IT System Engineer
> > > > 
> > > > Head Office
> > > > Office Park 2, P.O. Box 100
> > > > 1300 Vienna-Airport, Austria
> > > > 
> > > > Phone:  +43 (0)5 1766 11557
> > > > Fax:    +43 (0)5 1766 511557
> > > > Mobile: +43 (0)664 80111 11557
> > > > Email:  christian.kulov...@austrian.com
> > > > www:    www.austrian.com <http://www.austrian.com/>
> > > > 
> > > > ____________________________________________
> > > 
> > > ________________________________________________
> > > 
> > > Austrian Airlines AG, Office Park 2, P.O. Box 100, 1300 
> > > Vienna-Airport, Austria, registered office: Vienna, registered with 
> > > Vienna Commercial Court under FN 111000k, DVR 0091740. This e-mail 
> > > is confidential and is subject to disclaimers. Details can be found at:
> > > http://www.austrian.com/disclaimer.
> > > 
> > >  
> > > 
> > > 
> > > 
> > > > _______________________________________________
> > > > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> > > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker
> > > > 
> > > > Project Home: http://www.clusterlabs.org
> > > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> > > > Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
