On Tue, Jun 21, 2011 at 07:37:22AM +0200, Kulovits Christian - OS ITSC wrote:
> Hi Dejan,
> We have Sybase at our shop, and the start of the Sybase server may take
> anywhere from 5 to 45 minutes. I found a resource agent on the web that
> needs three timeout parameters passed to it: one for start, one for stop
> and one for monitor.

I guess you know that you have to make sure the resource agent is
correctly implemented. There's also ocf-tester to help with testing.

> And the cluster config itself has similar timeout values set for start, stop
> and monitor activity in the metadata for the defined resource primitive.
> Back to the Sybase server. I tried to change this RA to remove the
> redundant timeout parameters, run the start until the resource's
> start timeout has elapsed, set the resource itself to unmanaged with
>   crm_resource --meta -t primitive -r $OCF_RESOURCE_INSTANCE -p is-managed -v false
> and return with rc=0 to leave the starting Sybase running. But the part
> of the code that runs after the SIGTERM has only 5 seconds to live.
>
> The reason for doing so is that after the Sybase startup has timed out, the
> cluster itself will stop the Sybase resource, which terminates the
> startup process, and we have to run the long-lasting startup again. Another
> way would be to get the meta data for the resource primitive passed to the
> resource agent. But I have found no way to get it so far.
> Another way is to set the timeout to a very, very high value, but I think this
> is not a very good idea.

Why not? That's actually the only thing you can do. Note that a shorter
timeout helps only if the resource may hang.

Thanks,

Dejan

>
> Regards,
> Christian
>
> -----Original Message-----
> From: Dejan Muhamedagic [mailto:deja...@fastmail.fm]
> Sent: Monday, 20 June 2011 16:18
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Resource Agent timeout
>
> Hi,
>
> On Mon, Jun 20, 2011 at 03:15:23PM +0200, Kulovits Christian - OS ITSC wrote:
> > Andreas,
> > you mean the cluster-wide default timeout? I wonder if there is a
> > possibility to set the fixed timeout of 5 secs when SIGKILL is issued after
> > the SIGTERM when the resource timeout is exceeded.
>
> No, it's not configurable.
>
> What's the use case?
> You should prevent actions from timing out in the first place.
>
> Thanks,
>
> Dejan
>
> > Regards,
> > Christian
> >
> > -----Original Message-----
> > From: Andreas Kurz [mailto:andreas.k...@linbit.com]
> > Sent: Monday, 20 June 2011 15:08
> > To: pacemaker@oss.clusterlabs.org
> > Subject: Re: [Pacemaker] Resource Agent timeout
> >
> > On 2011-06-20 14:28, Kulovits Christian - OS ITSC wrote:
> > > Hello List,
> > >
> > > When a resource agent times out, a SIGTERM is issued once the timeout
> > > value has been exceeded. If the resource agent does not terminate
> > > within the next 5 seconds, a SIGKILL is issued. Is there a way to
> > > set this limit? Maybe to 30 secs or so? 5 seconds may often be
> > > insufficient for a proper cleanup.
> > >
> >
> > The default action timeout is 20s so you already "tuned" it ... you can set
> > a global "default-action-timeout" or specify a timeout for each operation
> > per resource.
> >
> > Regards,
> > Andreas
> >
> > > Jun 20 10:51:04 mars lrmd: [2178]: info: RA output: (res_TimeoutRA_Killroy:stop:stderr) + sleep 10
> > >
> > > Jun 20 10:51:08 mars lrmd: [2178]: WARN: res_TimeoutRA_Killroy:stop process (PID 24359) timed out (try 1). Killing with signal SIGTERM (15).
> > >
> > > Jun 20 10:51:08 mars lrmd: [2178]: info: RA output: (res_TimeoutRA_Killroy:stop:stderr) Terminated
> > >
> > > Jun 20 10:51:08 mars lrmd: [2178]: info: RA output: (res_TimeoutRA_Killroy:stop:stderr) ++ ha_debug 'DEBUG: Resource (res_TimeoutRA_Killroy): Timeout during stop of res_TimeoutRA_Killroy'
> > > ++ sleep 10
> > >
> > > Jun 20 10:51:13 mars lrmd: [2178]: WARN: res_TimeoutRA_Killroy:stop process (PID 24359) timed out (try 2). Killing with signal SIGKILL (9).
> > >
> > > Jun 20 10:51:13 mars lrmd: [2178]: WARN: operation stop[94] on ocf::TimeoutRA::res_TimeoutRA_Killroy for client 2181, its parameters: CRM_meta_timeout=[5000] crm_feature_set=[3.0.1] CRM_meta_name=[start] : pid [24359] timed out
> > >
> > > Jun 20 10:51:13 mars crmd: [2181]: ERROR: process_lrm_event: LRM operation res_TimeoutRA_Killroy_stop_0 (94) Timed Out (timeout=5000ms)
> > >
> > > With best regards,
> > > Christian Kulovits
> > >
> > > AUSTRIAN AIRLINES
> > > Christian Kulovits
> > > ITSC Central System & Database Services
> > > Senior IT System Engineer
> > >
> > > Head Office
> > > Office Park 2, P.O. Box 100
> > > 1300 Vienna-Airport, Austria
> > >
> > > Phone: +43 (0)5 1766 11557
> > > Fax: +43 (0)5 1766 511557
> > > Mobile: +43 (0)664 80111 11557
> > > email: christian.kulov...@austrian.com
> > > www: www.austrian.com <http://www.austrian.com/>
_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
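
A note on the question of getting the operation timeout into the resource agent without redundant parameters: the lrmd log quoted in the thread lists CRM_meta_timeout=[5000] among the operation's parameters. The sketch below assumes that this value reaches the agent's environment as OCF_RESKEY_CRM_meta_timeout (in milliseconds); please verify that against your Pacemaker version. The helper names budget_secs and wait_for_start are made up for illustration and are not part of any existing agent. It only shows how a start action could derive its own deadline from the configured timeout. It does not change the fixed 5-second window between SIGTERM and SIGKILL, so a generous start timeout is still needed.

```shell
#!/bin/sh
# Sketch only: budget_secs and wait_for_start are hypothetical helper names.

# Operation timeout in milliseconds. Assumption: the cluster exports it as
# OCF_RESKEY_CRM_meta_timeout; fall back to 20 s if the variable is unset.
timeout_ms="${OCF_RESKEY_CRM_meta_timeout:-20000}"

# Print the number of seconds an action may use, keeping a safety margin
# (default 10 s) so the agent returns before the timeout fires.
budget_secs() {
    ms="$1"
    margin="${2:-10}"
    secs=$(( ms / 1000 - margin ))
    if [ "$secs" -lt 1 ]; then
        secs=1    # never report a zero or negative budget
    fi
    echo "$secs"
}

# Run a check command once per second until it succeeds (return 0)
# or the given number of seconds is used up (return 1).
wait_for_start() {
    check="$1"
    remaining="$2"
    while [ "$remaining" -gt 0 ]; do
        if $check; then
            return 0
        fi
        sleep 1
        remaining=$(( remaining - 1 ))
    done
    return 1
}

echo "start budget: $(budget_secs "$timeout_ms") s"
```

In a start action this could be used roughly as wait_for_start my_status_check "$(budget_secs "$timeout_ms" 60)", where my_status_check stands for whatever command the agent already uses to test whether the server is up.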