>>> Jorge Fábregas <jorge.fabre...@gmail.com> wrote on 08.09.2015 at 17:45 in
message <55ef029c.3000...@gmail.com>:
> Hi,
>
> I've read about how important is the relationship between the different
> parameters of the SBD device (msgwait & watchdog timeout) & Pacemaker's
> stonith timeout. However I've just encountered something that I never
> considered: the time elapsed until a node is fully up (after being
> fenced) against msgwait.
>
> Two nodes: sles11a & sles11b. I fenced sles11a (via Hawk's interface
> that triggers the sbd resource agent) and watched carefully
> /var/log/messages on sles11b:
>
>
> Sept 8 11:27:00 sles11b sbd: Writing reset to node slot sles11a
> Sept 8 11:27:00 sles11b sbd: Messaging delay: 40
>
> [sles11a is rebooting and it comes up in about 12 seconds]

Lucky you (for the fast reboot time), but you have a problem:

1) The msgwait has to be long enough to make (as close as possible to)
100% sure that the node is down when the time has expired. Then the
cluster will perform recovery operations for the down node. If the node
is up earlier and has already rejoined the cluster, things may be in
some disorder.

2) The msgwait has to be long enough to make sure the SBD commands are
delivered even if a disk needs some retries, or your storage system is
slow while being online (this could mean you do an "online" firmware
upgrade where the system won't respond for a few seconds).

My guess would be to increase the node boot time and to decrease the
msgwait to something like 30 seconds. Usually you have SCSI timeouts
around one minute. Also remember that parts of the OS will retry I/O for
some time before flagging an error to the application.

>
> [see a bunch of messages joining the cluster]
>
> [finally node sles11a is online at about 11:27:25]
>
> Sept 8 11:27:40 sles11b sbd: Message successfully delivered
>
> [sles11a is put offline!]
>
> Sept 8 11:27:41 pengine[4358]: warning: custom_action: Action
> p_stonith-sdb_monitor_0 on sles11a
> is unrunnable (pending)

This is when the node is up and online, but fencing still isn't
confirmed?

>
> I've done it about 5 times and it happens every time.
>
> My values are: 20 (watchdog timeout) & 40 (msgwait). I know I
> know..it's too much for my lab environment but I'm just curious if
> there's something wrong or if indeed msgwait NEEDS to be ALWAYS less
> than reboot-time.

If you want to have an exciting configuration, you could try to get the
watchdog timeout down to 5 seconds or so, and shorten the msgwait (and
possibly other dependent parameters). But make sure support accepts such
short values.

BTW: We have a msgwait close to 3 minutes, allowing the storage to be
unresponsive for up to 60 seconds. The difference is a safety margin for
possible retries...
Our physical hosts hardly boot in less than 4 minutes.

Regards,
Ulrich

_______________________________________________
Users mailing list: Users@clusterlabs.org
http://clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
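
For reference, the timing discussed in this thread can be checked with a
little arithmetic. The following is only a rough sketch in Python, not
anything that sbd or Pacemaker provides: the helper names
(seconds_between, rejoin_gap) are made up for illustration, the
timestamps are copied from the quoted log, and "close to 3 minutes" and
"4 minutes" are rounded to 180 and 240 seconds as an assumption.

```python
from datetime import datetime

FMT = "%H:%M:%S"

# Timestamps copied from the quoted /var/log/messages excerpt (same day).
reset_written = datetime.strptime("11:27:00", FMT)  # "Writing reset to node slot sles11a"
node_online = datetime.strptime("11:27:25", FMT)    # "node sles11a is online at about 11:27:25"
delivered = datetime.strptime("11:27:40", FMT)      # "Message successfully delivered"

MSGWAIT = 40  # seconds, value from the original post


def seconds_between(start, end):
    """Whole seconds elapsed between two timestamps (hypothetical helper)."""
    return int((end - start).total_seconds())


def rejoin_gap(msgwait, time_to_rejoin):
    """Seconds by which the node rejoins BEFORE fencing is confirmed.

    Positive: the node is back while the fence is still pending.
    Negative: fencing was confirmed before the node came back.
    (Hypothetical helper, not part of sbd or Pacemaker.)
    """
    return msgwait - time_to_rejoin


time_to_rejoin = seconds_between(reset_written, node_online)
confirmation_delay = seconds_between(reset_written, delivered)

print("node rejoined after", time_to_rejoin, "s")
print("fencing confirmed after", confirmation_delay, "s")
print("lab setup gap:", rejoin_gap(MSGWAIT, time_to_rejoin), "s")

# Rounded figures from the reply: msgwait ~180 s, boot time ~240 s,
# storage allowed to be unresponsive for up to 60 s.
print("production gap:", rejoin_gap(180, 240), "s")
print("retry safety margin:", 180 - 60, "s")
```

With the lab values the gap is positive (the node is back in the cluster
before the fence is confirmed), while with the rounded production values
it is negative, which is the ordering the reply argues for.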