I don't have autojoin in my ha.cf, and I believe it defaults to "autojoin none", so that wouldn't explain why heartbeat keeps waiting after all nodes have joined.
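
For illustration, a minimal ha.cf fragment consistent with the setup described in this thread might look like the sketch below. It is not my actual file: the node names are hypothetical, autojoin is spelled out only to show the value I believe is the default, and the communication directives a real cluster needs are left out.

```
# Illustrative ha.cf fragment only (assumptions noted inline).

# Believed to be the default; written out here only for clarity.
autojoin none

# The 15-minute startup wait discussed in this thread.
initdead 900

# Enables the "Version 2 Resource Manager" (CRM).
crm yes

# Hypothetical node names for a 2-node cluster.
node node-a
node node-b
```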
I can see in /var/log/messages where crmd is doing the waiting for my 900-second initdead:

2010-01-11T13:51:15.428916-05:00 crmd: [4273]: info: do_started: The local CRM is operational
2010-01-11T13:51:15.428924-05:00 crmd: [4273]: info: do_state_transition: State transition S_STARTING -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_started ]
2010-01-11T14:06:15.964307-05:00 crmd: [4273]: info: crm_timer_popped: Election Trigger (I_DC_TIMEOUT) just popped!
2010-01-11T14:06:15.964337-05:00 crmd: [4273]: WARN: do_log: [[FSA]] Input I_DC_TIMEOUT from crm_timer_popped() received in state (S_PENDING)
2010-01-11T14:06:15.964348-05:00 crmd: [4273]: info: do_state_transition: State transition S_PENDING -> S_ELECTION [ input=I_DC_TIMEOUT cause=C_TIMER_POPPED origin=crm_timer_popped ]

I am using "Version 2 Resource Manager". I didn't previously realize this was the last version before the split.

I am also using DRBD, and yesterday I discovered that its wait-for-connection timeout (wfc-timeout) works as I had hoped initdead would, and by putting it before heartbeat in the startup sequence, it turns out I don't really need initdead after all.

Thanks,
David

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Dejan Muhamedagic
Sent: Tuesday, January 12, 2010 3:51 AM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] heartbeat waits for initdead even after all nodes have joined

Hi,

On Mon, Jan 11, 2010 at 03:21:05PM -0500, David Sickmiller wrote:
> Hi,
>
> I was hoping to configure my 2-node cluster to start as soon as both
> nodes were present but wait up to 15 minutes if the other node was
> missing upon system startup. In my case, a delay of several minutes is
> better than a split-brain scenario. The Linux-HA documentation says
> "The initdead parameter is used to set the time that it takes to declare
> a cluster node dead when Heartbeat is first started.", so I figured I
> could just set "initdead 900" in ha.cf.
> Unfortunately, heartbeat seems
> to be waiting for the entire initdead time interval regardless of
> whether all the nodes are present.
>
> Does this match others' experiences? Is there a different setting that
> could accomplish my objective?
>
> It seems like the documentation would be more accurate if it said "The
> initdead parameter is used to set the time that heartbeat waits before
> starting any resources, which allows time for additional nodes to join."

If you have autojoin set to "any".

> However, I would much prefer that Linux-HA behaved according to the
> original documentation.
>
> I'm using Heartbeat 2.1.4 on RHEL 5.4.

Please switch to Pacemaker/heartbeat or Pacemaker/corosync. Or
are you using v1/haresources?

Thanks,

Dejan

_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
