Re: [Pacemaker] metadata (timeout) ignored?
On Thu, Jan 21, 2010 at 11:00 AM, Dejan Muhamedagic wrote:
> Also, if you use crm shell it will print warnings in case the
> timeouts are smaller than what's advised.

Oh! Neat :-)

___
Pacemaker mailing list
Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker
Re: [Pacemaker] metadata (timeout) ignored?
Hi,

On Thu, Jan 21, 2010 at 10:18:09AM +0100, Markus M. wrote:
> Hello,
>
> Dejan Muhamedagic wrote:
>
> >> return the value of 100 seconds for the stop action? Is there
> >> another place to set the timeout for the stop action of this ra?
> > Yes, in the cluster configuration. Like this:
>
> Thank you, I see, and it works now!
>
> This was really an RTFM question, sorry. But I wonder what is the
> intention of the ocf resource agent "meta-data" action if the
> returned output seems not to be used anywhere?

Those are minimum values advised by the author of the resource agent. Obviously they can't fit all resources. Also, if you use crm shell it will print warnings in case the timeouts are smaller than what's advised.

Thanks,

Dejan

> With kind regards
> Markus
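
The kind of check described above (warn when a configured timeout is below the advised minimum) can be sketched roughly as follows. This is only an illustration and not the crm shell's actual implementation: the XML excerpt is a hypothetical, shortened meta-data document with values taken from this thread, and the function names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical meta-data excerpt; only the <actions> section matters here.
METADATA = """\
<resource-agent name="cluster_oracle">
  <actions>
    <action name="start" timeout="240"/>
    <action name="stop" timeout="100"/>
    <action name="monitor" timeout="20" interval="20" depth="0"/>
  </actions>
</resource-agent>
"""


def advised_timeouts(metadata_xml):
    """Return {action name: advised minimum timeout in seconds}."""
    root = ET.fromstring(metadata_xml)
    advised = {}
    for action in root.iter("action"):
        timeout = action.get("timeout")
        if timeout is not None:
            # Assumes plain seconds, optionally with an "s" suffix.
            advised[action.get("name")] = int(timeout.rstrip("s"))
    return advised


def timeout_warnings(configured, advised):
    """List a warning for each configured timeout below the advised minimum."""
    warnings = []
    for name, seconds in sorted(configured.items()):
        minimum = advised.get(name)
        if minimum is not None and seconds < minimum:
            warnings.append(
                f"{name}: timeout {seconds}s is smaller than the advised {minimum}s"
            )
    return warnings


advised = advised_timeouts(METADATA)
# A stop timeout of 20s, as observed in the log, is below the advised 100s.
print(timeout_warnings({"stop": 20, "start": 240}, advised))
```

The point of the sketch is that the meta-data values are only compared against the configuration; nothing in it applies them to the resource.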
Re: [Pacemaker] metadata (timeout) ignored?
On Thu, Jan 21, 2010 at 10:18 AM, Markus M. wrote:
> Hello,
>
> Dejan Muhamedagic wrote:
>
>>> return the value of 100 seconds for the stop action? Is there
>>> another place to set the timeout for the stop action of this ra?
>> Yes, in the cluster configuration. Like this:
>
> Thank you, I see, and it works now!
>
> This was really an RTFM question, sorry. But I wonder what is the intention
> of the ocf resource agent "meta-data" action if the returned output seems
> not to be used anywhere?

Hints to GUIs, i.e. they could in theory preset things like timeouts when you create a monitor operation.
Re: [Pacemaker] metadata (timeout) ignored?
Hello,

Dejan Muhamedagic wrote:

>> return the value of 100 seconds for the stop action? Is there
>> another place to set the timeout for the stop action of this ra?
> Yes, in the cluster configuration. Like this:

Thank you, I see, and it works now!

This was really an RTFM question, sorry. But I wonder what is the intention of the ocf resource agent "meta-data" action if the returned output seems not to be used anywhere?

With kind regards
Markus
Re: [Pacemaker] metadata (timeout) ignored?
Hi,

On Wed, Jan 20, 2010 at 09:45:46PM +0100, Markus M. wrote:
> Dejan Muhamedagic wrote:
>
> >> Operations' defaults (advisory minimum):
> >>
> >> stop timeout=100
> >>
> >> So it seems for the "stop" action there is a timeout of 100 seconds
> >> defined. But at cluster shutdown I can see this in the ha-debug log:
> >
> > It says above that it's "advisory minimum" (the wording should
> > probably be changed). You have to set the timeouts yourself.
>
> Sorry, maybe I've misunderstood something... I thought _I've set the
> timeout_ by making the ocf resource agent meta-data function
> return the value of 100 seconds for the stop action? Is there
> another place to set the timeout for the stop action of this ra?

Yes, in the cluster configuration. Like this:

    primitive rsc_c001n07 ocf:heartbeat:IPaddr \
        params ip="127.0.0.16" cidr_netmask="32" \
        op stop timeout="100s"

Thanks,

Dejan

> The timeout is occurring after 20 seconds:
>
> >> Jan 18 14:31:35 node1 crmd: [12844]: info: te_rsc_command:
> >> Initiating action 5: stop oracle_primary_stop_0 on node1 (local)
> ...
> >> Jan 18 14:31:55 node1 lrmd: [12841]: WARN: oracle_primary:stop
> >> process (PID 14386) timed out (try 1). Killing with signal SIGTERM
> >> (15).
>
> Regards
> Markus
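
For the resource discussed in this thread, the same pattern could be written out with a small helper that renders a crm shell "primitive" definition. This is only a sketch: render_primitive is a hypothetical helper, the resource name is taken from the log, the oracle_role value is an assumed example, and the ms definition a Master/Slave resource also needs is left out.

```python
def render_primitive(rsc_id, agent, params, ops):
    """Render a crm shell "primitive" definition as a string."""
    lines = [f"primitive {rsc_id} {agent}"]
    if params:
        rendered = " ".join(f'{key}="{value}"' for key, value in params.items())
        lines.append(f"    params {rendered}")
    for name, attrs in ops:
        rendered = " ".join(f'{key}="{value}"' for key, value in attrs.items())
        lines.append(f"    op {name} {rendered}")
    # crm shell continues a definition across lines with a trailing backslash.
    return " \\\n".join(lines)


print(render_primitive(
    "oracle_primary",                  # resource name taken from the log
    "ocf:heartbeat:cluster_oracle",
    {"oracle_role": "master"},         # hypothetical parameter value
    [("start", {"timeout": "240s"}), ("stop", {"timeout": "100s"})],
))
```

Each op clause carries its own timeout, so the advised values from the meta-data have to be repeated explicitly for every operation that needs more than the cluster default.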
Re: [Pacemaker] metadata (timeout) ignored?
Dejan Muhamedagic wrote:

>> Operations' defaults (advisory minimum):
>>
>> stop timeout=100
>>
>> So it seems for the "stop" action there is a timeout of 100 seconds
>> defined. But at cluster shutdown I can see this in the ha-debug log:
>
> It says above that it's "advisory minimum" (the wording should
> probably be changed). You have to set the timeouts yourself.

Sorry, maybe I've misunderstood something... I thought _I've set the timeout_ by making the ocf resource agent meta-data function return the value of 100 seconds for the stop action? Is there another place to set the timeout for the stop action of this ra?

The timeout is occurring after 20 seconds:

>> Jan 18 14:31:35 node1 crmd: [12844]: info: te_rsc_command:
>> Initiating action 5: stop oracle_primary_stop_0 on node1 (local)
...
>> Jan 18 14:31:55 node1 lrmd: [12841]: WARN: oracle_primary:stop
>> process (PID 14386) timed out (try 1). Killing with signal SIGTERM
>> (15).

Regards
Markus
Re: [Pacemaker] metadata (timeout) ignored?
Hi,

On Wed, Jan 20, 2010 at 04:28:49PM +0100, Markus M. wrote:
> Hello,
>
> I've a question about metadata returned by an ocf resource agent
> using the "meta-data" command and the behaviour of the cluster.
>
> When checking the resource agent's metadata using crm I get this:
>
> # crm
> crm(live)# ra
> crm(live)ra# meta cluster_oracle ocf
> bla (ocf:heartbeat:cluster_oracle)
>
> Master/Slave OCF Resource Agent for Oracle (clustered)
>
> Parameters (* denotes required, [] the default):
>
> oracle_role* (string): Ora role
>     Required to assign the Oracle role. Must be "master" or "slave"
>
> Operations' defaults (advisory minimum):
>
>     start timeout=240
>     promote timeout=90
>     demote timeout=90
>     notify timeout=90
>     stop timeout=100
>     monitor timeout=20 interval=20 depth=0
>     monitor timeout=20 interval=10 depth=0
>
> So it seems for the "stop" action there is a timeout of 100 seconds
> defined. But at cluster shutdown I can see this in the ha-debug log:

It says above that it's "advisory minimum" (the wording should probably be changed). You have to set the timeouts yourself.

Thanks,

Dejan

> Jan 18 14:31:35 node1 crmd: [12844]: info: te_rsc_command:
> Initiating action 5: stop oracle_primary_stop_0 on node1 (local)
> Jan 18 14:31:35 node11 pengine: [12848]: notice: LogActions: Leave
> resource oracle_secondary (Stopped)
> Jan 18 14:31:35 node1 lrmd: [12841]: info: rsc:oracle_primary:7: stop
> Jan 18 14:31:35 node1 crmd: [12844]: info: do_lrm_rsc_op: Performing
> key=5:10:0:40ea1f42-c929-40d6-a0ed-569a7c8944bc
> op=oracle_primary_stop_0 )
> Jan 18 14:31:35 node1 lrmd: [12841]: info: RA output:
> (oracle_primary:stop:stderr)
> /usr/lib/ocf/resource.d//heartbeat/cluster_oracle[247]:
> Jan 18 14:31:35 node1 pengine: [12848]: WARN: process_pe_message:
> Transition 10: WARNINGs found during PE processing. PEngine Input
> stored in: /var/lib/pengine/pe-warn-2220.bz2
> Jan 18 14:31:35 node1 pengine: [12848]: info: process_pe_message:
> Configuration WARNINGs found during PE processing. Please run
> "crm_verify -L" to identify issues.
> Jan 18 14:31:55 node1 lrmd: [12841]: WARN: oracle_primary:stop
> process (PID 14386) timed out (try 1). Killing with signal SIGTERM
> (15).
> Jan 18 14:31:55 node1 lrmd: [12841]: info: RA output:
> (oracle_primary:stop:stderr)
> Session terminated, killing shell...
> Jan 18 14:31:57 node1 lrmd: [12841]: info: RA output:
> (oracle_primary:stop:stderr) ...killed.
>
> Apparently a timeout occurred at the stop action after 20 seconds.
> But why, if the resource defined 100 secs?
>
> With kind regards
> Markus
[Pacemaker] metadata (timeout) ignored?
Hello,

I've a question about metadata returned by an ocf resource agent using the "meta-data" command and the behaviour of the cluster.

When checking the resource agent's metadata using crm I get this:

# crm
crm(live)# ra
crm(live)ra# meta cluster_oracle ocf
bla (ocf:heartbeat:cluster_oracle)

Master/Slave OCF Resource Agent for Oracle (clustered)

Parameters (* denotes required, [] the default):

oracle_role* (string): Ora role
    Required to assign the Oracle role. Must be "master" or "slave"

Operations' defaults (advisory minimum):

    start timeout=240
    promote timeout=90
    demote timeout=90
    notify timeout=90
    stop timeout=100
    monitor timeout=20 interval=20 depth=0
    monitor timeout=20 interval=10 depth=0

So it seems for the "stop" action there is a timeout of 100 seconds defined. But at cluster shutdown I can see this in the ha-debug log:

...
Jan 18 14:31:35 node1 crmd: [12844]: info: te_rsc_command: Initiating action 5: stop oracle_primary_stop_0 on node1 (local)
Jan 18 14:31:35 node11 pengine: [12848]: notice: LogActions: Leave resource oracle_secondary (Stopped)
Jan 18 14:31:35 node1 lrmd: [12841]: info: rsc:oracle_primary:7: stop
Jan 18 14:31:35 node1 crmd: [12844]: info: do_lrm_rsc_op: Performing key=5:10:0:40ea1f42-c929-40d6-a0ed-569a7c8944bc op=oracle_primary_stop_0 )
Jan 18 14:31:35 node1 lrmd: [12841]: info: RA output: (oracle_primary:stop:stderr) /usr/lib/ocf/resource.d//heartbeat/cluster_oracle[247]:
Jan 18 14:31:35 node1 pengine: [12848]: WARN: process_pe_message: Transition 10: WARNINGs found during PE processing. PEngine Input stored in: /var/lib/pengine/pe-warn-2220.bz2
Jan 18 14:31:35 node1 pengine: [12848]: info: process_pe_message: Configuration WARNINGs found during PE processing. Please run "crm_verify -L" to identify issues.
Jan 18 14:31:55 node1 lrmd: [12841]: WARN: oracle_primary:stop process (PID 14386) timed out (try 1). Killing with signal SIGTERM (15).
Jan 18 14:31:55 node1 lrmd: [12841]: info: RA output: (oracle_primary:stop:stderr) Session terminated, killing shell...
Jan 18 14:31:57 node1 lrmd: [12841]: info: RA output: (oracle_primary:stop:stderr) ...killed.

Apparently a timeout occurred at the stop action after 20 seconds. But why, if the resource defined 100 secs?

With kind regards
Markus
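
The 20 seconds can be read directly from the two lrmd timestamps in the log above. A minimal sketch of that arithmetic follows; log_time is a hypothetical helper, and the year is a placeholder because syslog timestamps do not carry one. The result is what one would expect if the cluster-wide default operation timeout (commonly 20s) was in effect because no stop timeout was configured, though that is an assumption about this cluster's settings.

```python
from datetime import datetime


def log_time(line):
    """Parse the leading "Mon DD HH:MM:SS" syslog timestamp of a log line."""
    stamp = " ".join(line.split()[:3])
    # Syslog timestamps carry no year; a fixed leap year lets any date parse.
    return datetime.strptime(f"2000 {stamp}", "%Y %b %d %H:%M:%S")


started = "Jan 18 14:31:35 node1 lrmd: [12841]: info: rsc:oracle_primary:7: stop"
timed_out = (
    "Jan 18 14:31:55 node1 lrmd: [12841]: WARN: oracle_primary:stop "
    "process (PID 14386) timed out (try 1). Killing with signal SIGTERM (15)."
)

# Time between lrmd starting the stop action and lrmd killing it.
elapsed = log_time(timed_out) - log_time(started)
print(elapsed.total_seconds())
```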