Serge, thanks for the quick response (and missing flame :)). I've added to the server primitive:
<operations>
  <op id="1" name="stop" timeout="20s"/>
  <op id="2" name="start" timeout="20s"/>
</operations>

but I still get the timeout. At the risk of exposing my stupidity, here are more details:

I've added some ocf_log calls to the server script; their output is the first two lines of the log below, the ones containing (lew). My start_server function looks as follows:

start_server() {
    ocf_log info "(lew) in /usr/lib/ocf/resource.d/heartbeat server start_server function"
    instance=`echo $OCF_RESOURCE_INSTANCE`
    ocf_log info "(lew) instance = $instance"
    ocf_run /lew/server h
    ocf_log info "(lew) in /usr/lib/ocf/resource.d/heartbeat server start_server function, return from ocf_run"
    return $OCF_SUCCESS
}

Notice that the return from ocf_run is not logged below. So maybe I have some Unix daemon coding issue. But the app is pretty trivial, like I said: it just forks a child, and the parent calls exit(0).

server[23250]: 2008/01/02_21:22:50 INFO: (lew) in /usr/lib/ocf/resource.d/heartbeat server start_server function
server[23250]: 2008/01/02_21:22:50 INFO: (lew) instance = server_value2
cib[16393]: 2008/01/02_21:22:51 WARN: do_cib_notify: cib_modify of <nvpair > FAILED: The object/attribute does not exist
cib[16393]: 2008/01/02_21:22:51 ERROR: cib_process_request: cib_modify operation failed: The object/attribute does not exist
cib[16393]: 2008/01/02_21:22:51 info: crm_log_message_adv: #========= Input message message start ==========#
cib[16393]: 2008/01/02_21:22:51 info: MSG: Dumping message with 20 fields
cib[16393]: 2008/01/02_21:22:51 info: MSG[0] : [t=cib]
cib[16393]: 2008/01/02_21:22:51 info: MSG[1] : [cib_clientid=ab553d5e-f9b9-459d-88b2-0e9de0bf9e59]
cib[16393]: 2008/01/02_21:22:51 info: MSG[2] : [cib_callopt=1048576]
cib[16393]: 2008/01/02_21:22:51 info: MSG[3] : [cib_callid=153]
cib[16393]: 2008/01/02_21:22:51 info: MSG[4] : [cib_op=cib_modify]
cib[16393]: 2008/01/02_21:22:51 info: MSG[5] : [cib_section=status]
cib[16393]: 2008/01/02_21:22:51 info: MSG[6] : [cib_clientname=961]
cib[16393]: 2008/01/02_21:22:51 info: MSG[7] : [(5)cib_calldata=0x806a490(114 136)]
cib[16393]: 2008/01/02_21:22:51 info: <nvpair id="status-41b0e7f1-55ca-472e-8ea0-f7acb9e99613-pingd" name="pingd" value="0"/>
cib[16393]: 2008/01/02_21:22:51 info: MSG[8] : [cib_delegated_from=c001n01]
cib[16393]: 2008/01/02_21:22:51 info: MSG[9] : [from_id=cib]
cib[16393]: 2008/01/02_21:22:51 info: MSG[10] : [to_id=cib]
cib[16393]: 2008/01/02_21:22:51 info: MSG[11] : [client_gen=5]
cib[16393]: 2008/01/02_21:22:51 info: MSG[12] : [src=c001n01]
cib[16393]: 2008/01/02_21:22:51 info: MSG[13] : [(1)srcuuid=0x807f408(36 27)]
cib[16393]: 2008/01/02_21:22:51 info: MSG[14] : [seq=2c9f1]
cib[16393]: 2008/01/02_21:22:51 info: MSG[15] : [hg=47684a1b]
cib[16393]: 2008/01/02_21:22:51 info: MSG[16] : [ts=477c00ab]
cib[16393]: 2008/01/02_21:22:51 info: MSG[17] : [ld=0.24 0.28 0.21 5/157 19634]
cib[16393]: 2008/01/02_21:22:51 info: MSG[18] : [ttl=4]
cib[16393]: 2008/01/02_21:22:51 info: MSG[19] : [_compression_algorithm=zlib]
tengine[18635]: 2008/01/02_21:23:10 WARN: action_timer_callback: Timer popped (abort_level=1000000, complete=false)
tengine[18635]: 2008/01/02_21:23:10 WARN: print_elem: Action missed its timeout[Action 5]: In-flight (id: server2_start_0, loc: c001n02, priority: 0)
lrmd[16394]: 2008/01/02_21:23:10 WARN: server2:start process (PID 23238) timed out (try 1). Killing with signal SIGTERM (15).

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Serge Dubrouski
Sent: Wednesday, January 02, 2008 4:25 PM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] daemon timeout trying to use ocf to startup

Looks like your OCF "server" script wasn't able to start the server in the given time.

On Jan 2, 2008 1:47 PM, <[EMAIL PROTECTED]> wrote:
> Well you all seem like a friendly enough bunch as I lurk about the list,
> so here goes...
> I've read some fine Linux-HA (V2) tutorials and have begun experimenting
> with Linux-HA on a 2 node setup.
> Installation of heartbeat went well
> and I even glimpsed ip failover in action. Now I am attempting to use
> ocf to launch a test daemon of mine, by mimicking the apache scripts
> d/l'd with V2.
>
> From log debug o/p, I see ocf_run launching my daemon, but then see a
> timeout in ha-log:
>
> tengine[23602]: 2008/01/02_17:56:46 WARN: print_elem: Action missed
> its timeout[Action 6]: In-flight (id: server1_start_0, loc: c001n02,
> priority: 0)
>
> I'll be the first to admit, this app is not production qual. It simply
> forks a child that sits and waits on a listen. I could not imagine
> there was more needed for this experiment but maybe there is.
>
> o/p from crm_verify also reveals:
>
> crm_verify[26701]: 2008/01/02_19:40:11 WARN: unpack_rsc_op:
> Processing failed op server2_start_0 on c001n02: Timed Out
> crm_verify[26701]: 2008/01/02_19:40:11 WARN: unpack_rsc_op:
> Compatability handling for failed op server2_start_0 on c001n02
> crm_verify[26701]: 2008/01/02_19:40:11 WARN: native_color: Resource
> server2 cannot run anywhere
>
> Here is the related portion of the cib.xml:
>
> <primitive id="server2" class="heartbeat" type="server">
>   <instance_attributes id="ia2_s2">
>     <attributes>
>       <nvpair id="s2" name="1" value="value2"/>
>     </attributes>
>   </instance_attributes>
> </primitive>
> .
> .
> .
> <constraints>
>   <rsc_location id="run_ip_resource_2" rsc="server2">
>     <rule id="pref_run_ip_resource_2" score="100">
>       <expression id="e2" attribute="#hostname" operation="eq"
>         value="c001n02"/>
>     </rule>
>   </rsc_location>
> </constraints>
>
>
> The type "server", as I alluded, mimics the apache scripts provided in
> the heartbeat d/l. I launch the app using ocf_run, via the server
> script deposited in /usr/lib/ocf/resource.d/heartbeat/server.
>
> Can someone give me a clue why the timeout occurs?
>
> Thanks a lot,
> lew
> _______________________________________________
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
>

--
Serge Dubrouski.
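
One possible reading of the missing "return from ocf_run" log line, offered as a guess rather than a diagnosis: if ocf_run collects the command's output through command substitution (an assumption about the ocf-shellfuncs in use), it cannot return until every process holding that pipe has closed it, and a forked child that inherits stdout/stderr keeps it open. The sketch below reproduces only that shell behaviour. leaky_daemon and tidy_daemon are hypothetical stand-ins, not /lew/server, and the 2-second sleep stands in for a long-lived child.

```shell
#!/bin/sh
# Hypothetical stand-ins for a daemon that backgrounds a child and returns.
leaky_daemon() { sleep 2 & }                  # child inherits the caller's stdout/stderr
tidy_daemon()  { sleep 2 >/dev/null 2>&1 & }  # child detaches from stdout/stderr first

# Capture output the way a wrapper built on command substitution would.
t0=$(date +%s)
out=$(leaky_daemon 2>&1)    # blocks until the background child closes the pipe
leaky_elapsed=$(( $(date +%s) - t0 ))

t0=$(date +%s)
out=$(tidy_daemon 2>&1)     # returns as soon as the function itself returns
tidy_elapsed=$(( $(date +%s) - t0 ))

echo "leaky start took ${leaky_elapsed}s, tidy start took ${tidy_elapsed}s"
```

If the same thing is happening here, redirecting the daemon's stdout and stderr to /dev/null (in the daemon itself or on the ocf_run command line) would be one thing to try before raising the timeout further.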