On 05/25/2018 12:44 PM, Andrei Borzenkov wrote:
> On Fri, May 25, 2018 at 10:08 AM, Klaus Wenninger <kwenn...@redhat.com> wrote:
>> On 05/25/2018 07:31 AM, 井上 和徳 wrote:
>>> Hi,
>>>
>>> I am checking the watchdog function of SBD (without a shared block device).
>>> In a two-node cluster, if one node is stopped, the watchdog is triggered
>>> on the remaining node.
>>> Is this the designed behavior?
>> SBD without a shared block device doesn't really make sense on
>> a two-node cluster.
>> The basic idea is - e.g. in the case of a networking problem -
>> that a cluster splits up into a quorate and a non-quorate partition.
>> The quorate partition stays up, while SBD guarantees a
>> reliable watchdog-based self-fencing of the non-quorate partition
>> within a defined timeout.
> Does it require no-quorum-policy=suicide, or does it decide completely
> independently? I.e. would it fire also with no-quorum-policy=ignore?
Eventually it will fire in any case, but no-quorum-policy decides how long
that takes. In the case of suicide, the inquisitor will immediately stop
tickling the watchdog. In all other cases the pacemaker-servant will stop
pinging the inquisitor, which makes the servant time out after a default of
4 seconds, and then the inquisitor will stop tickling the watchdog.
But that is only relevant if Corosync doesn't have two_node enabled. See
the comment below for that case.
>
>> This idea of course doesn't work with just 2 nodes.
>> Taking quorum info from the 2-node feature of corosync (automatically
>> switching on wait-for-all) doesn't help in this case but instead
>> would lead to split-brain.
> So what you are saying is that SBD ignores quorum information from
> corosync and takes its own decisions based on a pure count of nodes. Do
> I understand that correctly?
Yes, but that is only true for this case where Corosync has two_node
enabled. In all other cases (be it clusters with more than 2 nodes, or
clusters with just 2 nodes but without two_node enabled in Corosync) the
pacemaker-servant takes quorum info from Pacemaker, which nowadays
probably comes directly from Corosync.
But as said, if two_node is configured in Corosync everything is
different: the node-counting is then actually done by the cluster-servant,
and it is this servant (instead of the pacemaker-servant) that stops
pinging the inquisitor if it doesn't count more than 1 node.
That all said, I've just realized that setting two_node in Corosync
shouldn't really be dangerous anymore, although it doesn't make the
cluster especially useful either in the case of SBD without disk(s).

Regards,
Klaus
>
>> What you can do - and what e.g. pcs does automatically - is enable
>> the auto-tie-breaker instead of two_node in corosync. But that
>> still doesn't give you higher availability than that of the
>> winner of the auto-tie-breaker.
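For reference, the two alternatives mentioned here would look roughly like
this in corosync.conf (a sketch based on the votequorum(5) and
corosync-qdevice(8) documentation, not something from this thread;
"qnetd-host" is a placeholder for a third machine running corosync-qnetd):

```conf
# Alternative 1: auto_tie_breaker instead of two_node
# (on an even split, the partition holding the lowest node id keeps quorum)
quorum {
    provider: corosync_votequorum
    auto_tie_breaker: 1
    auto_tie_breaker_node: lowest
}

# Alternative 2: qdevice, for 'real' quorum with only 2 full cluster nodes
quorum {
    provider: corosync_votequorum
    device {
        model: net
        net {
            host: qnetd-host
            algorithm: ffsplit
        }
    }
}
```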
>> (Maybe interesting if you are going
>> for a load-balancing scenario that doesn't affect availability, or
>> for a transient state while setting up a cluster node-by-node ...)
>> What you can do though is use qdevice to still have 'real-quorum'
>> info with just 2 full cluster nodes.
>>
>> There was quite a lot of discussion around this topic on this
>> list previously, if you search the history.
>>
>> Regards,
>> Klaus
>>
>>> [vmrh75b]# cat /etc/corosync/corosync.conf
>>> (snip)
>>> quorum {
>>>     provider: corosync_votequorum
>>>     two_node: 1
>>> }
>>>
>>> [vmrh75b]# cat /etc/sysconfig/sbd
>>> # This file has been generated by pcs.
>>> SBD_DELAY_START=no
>>> ## SBD_DEVICE="/dev/vdb1"
>>> SBD_OPTS="-vvv"
>>> SBD_PACEMAKER=yes
>>> SBD_STARTMODE=always
>>> SBD_WATCHDOG_DEV=/dev/watchdog
>>> SBD_WATCHDOG_TIMEOUT=5
>>>
>>> [vmrh75b]# crm_mon -r1
>>> Stack: corosync
>>> Current DC: vmrh75a (version 2.0.0-0.1.rc4.el7-2.0.0-rc4) - partition with quorum
>>> Last updated: Fri May 25 13:36:07 2018
>>> Last change: Fri May 25 13:35:22 2018 by root via cibadmin on vmrh75a
>>>
>>> 2 nodes configured
>>> 0 resources configured
>>>
>>> Online: [ vmrh75a vmrh75b ]
>>>
>>> No resources
>>>
>>> [vmrh75b]# pcs property show
>>> Cluster Properties:
>>>  cluster-infrastructure: corosync
>>>  cluster-name: my_cluster
>>>  dc-version: 2.0.0-0.1.rc4.el7-2.0.0-rc4
>>>  have-watchdog: true
>>>  stonith-enabled: false
>>>
>>> [vmrh75b]# ps -ef | egrep "sbd|coro|pace"
>>> root      2169     1  0 13:34 ?  00:00:00 sbd: inquisitor
>>> root      2170  2169  0 13:34 ?  00:00:00 sbd: watcher: Pacemaker
>>> root      2171  2169  0 13:34 ?  00:00:00 sbd: watcher: Cluster
>>> root      2172     1  0 13:34 ?  00:00:00 corosync
>>> root      2179     1  0 13:34 ?  00:00:00 /usr/sbin/pacemakerd -f
>>> haclust+  2180  2179  0 13:34 ?  00:00:00 /usr/libexec/pacemaker/pacemaker-based
>>> root      2181  2179  0 13:34 ?  00:00:00 /usr/libexec/pacemaker/pacemaker-fenced
>>> root      2182  2179  0 13:34 ?  00:00:00 /usr/libexec/pacemaker/pacemaker-execd
>>> haclust+  2183  2179  0 13:34 ?  00:00:00 /usr/libexec/pacemaker/pacemaker-attrd
>>> haclust+  2184  2179  0 13:34 ?  00:00:00 /usr/libexec/pacemaker/pacemaker-schedulerd
>>> haclust+  2185  2179  0 13:34 ?  00:00:00 /usr/libexec/pacemaker/pacemaker-controld
>>>
>>> [vmrh75b]# pcs cluster stop vmrh75a
>>> vmrh75a: Stopping Cluster (pacemaker)...
>>> vmrh75a: Stopping Cluster (corosync)...
>>>
>>> [vmrh75b]# tail -F /var/log/messages
>>> May 25 13:37:00 vmrh75b pacemaker-controld[2185]: notice: Our peer on the DC (vmrh75a) is dead
>>> May 25 13:37:00 vmrh75b pacemaker-controld[2185]: notice: State transition S_NOT_DC -> S_ELECTION
>>> May 25 13:37:00 vmrh75b pacemaker-controld[2185]: notice: State transition S_ELECTION -> S_INTEGRATION
>>> May 25 13:37:00 vmrh75b pacemaker-attrd[2183]: notice: Node vmrh75a state is now lost
>>> May 25 13:37:00 vmrh75b pacemaker-attrd[2183]: notice: Removing all vmrh75a attributes for peer loss
>>> May 25 13:37:00 vmrh75b pacemaker-attrd[2183]: notice: Lost attribute writer vmrh75a
>>> May 25 13:37:00 vmrh75b pacemaker-attrd[2183]: notice: Purged 1 peer with id=1 and/or uname=vmrh75a from the membership cache
>>> May 25 13:37:00 vmrh75b pacemaker-fenced[2181]: notice: Node vmrh75a state is now lost
>>> May 25 13:37:00 vmrh75b pacemaker-fenced[2181]: notice: Purged 1 peer with id=1 and/or uname=vmrh75a from the membership cache
>>> May 25 13:37:00 vmrh75b pacemaker-based[2180]: notice: Node vmrh75a state is now lost
>>> May 25 13:37:00 vmrh75b pacemaker-based[2180]: notice: Purged 1 peer with id=1 and/or uname=vmrh75a from the membership cache
>>> May 25 13:37:00 vmrh75b pacemaker-controld[2185]: warning: Input I_ELECTION_DC received in state S_INTEGRATION from do_election_check
>>> May 25 13:37:01 vmrh75b sbd[2171]: cluster: warning: set_servant_health: Connected to corosync but requires both nodes present
>>> May 25 13:37:01 vmrh75b sbd[2171]: cluster: warning: notify_parent: Notifying parent: UNHEALTHY (6)
>>> May 25 13:37:01 vmrh75b sbd[2169]: warning: inquisitor_child: cluster health check: UNHEALTHY
>>> May 25 13:37:01 vmrh75b sbd[2169]: warning: inquisitor_child: Servant cluster is outdated (age: 226)
>>> May 25 13:37:01 vmrh75b sbd[2170]: pcmk: notice: unpack_config: Watchdog will be used via SBD if fencing is required
>>> May 25 13:37:01 vmrh75b sbd[2170]: pcmk: info: determine_online_status: Node vmrh75b is online
>>> May 25 13:37:01 vmrh75b sbd[2170]: pcmk: info: unpack_node_loop: Node 2 is already processed
>>> May 25 13:37:01 vmrh75b sbd[2170]: pcmk: info: unpack_node_loop: Node 2 is already processed
>>> May 25 13:37:01 vmrh75b sbd[2171]: cluster: warning: notify_parent: Notifying parent: UNHEALTHY (6)
>>> May 25 13:37:01 vmrh75b corosync[2172]: [TOTEM ] A new membership (192.168.28.132:5712) was formed. Members left: 1
>>> May 25 13:37:01 vmrh75b corosync[2172]: [QUORUM] Members[1]: 2
>>> May 25 13:37:01 vmrh75b corosync[2172]: [MAIN  ] Completed service synchronization, ready to provide service.
>>> May 25 13:37:01 vmrh75b pacemakerd[2179]: notice: Node vmrh75a state is now lost
>>> May 25 13:37:01 vmrh75b pacemaker-controld[2185]: notice: Node vmrh75a state is now lost
>>> May 25 13:37:01 vmrh75b pacemaker-controld[2185]: warning: Stonith/shutdown of node vmrh75a was not expected
>>> May 25 13:37:02 vmrh75b sbd[2171]: cluster: warning: notify_parent: Notifying parent: UNHEALTHY (6)
>>> May 25 13:37:02 vmrh75b pacemaker-schedulerd[2184]: notice: Watchdog will be used via SBD if fencing is required
>>> May 25 13:37:02 vmrh75b pacemaker-schedulerd[2184]: warning: Blind faith: not fencing unseen nodes
>>> May 25 13:37:02 vmrh75b pacemaker-schedulerd[2184]: notice: Delaying fencing operations until there are resources to manage
>>> May 25 13:37:02 vmrh75b pacemaker-schedulerd[2184]: notice: Calculated transition 0, saving inputs in /var/lib/pacemaker/pengine/pe-input-1410.bz2
>>> May 25 13:37:02 vmrh75b pacemaker-controld[2185]: notice: Transition 0 (Complete=0, Pending=0, Fired=0, Skipped=0, Incomplete=0, Source=/var/lib/pacemaker/pengine/pe-input-1410.bz2): Complete
>>> May 25 13:37:02 vmrh75b pacemaker-controld[2185]: notice: State transition S_TRANSITION_ENGINE -> S_IDLE
>>> May 25 13:37:03 vmrh75b sbd[2171]: cluster: warning: notify_parent: Notifying parent: UNHEALTHY (6)
>>> May 25 13:37:03 vmrh75b sbd[2170]: pcmk: notice: unpack_config: Watchdog will be used via SBD if fencing is required
>>> May 25 13:37:03 vmrh75b sbd[2170]: pcmk: info: determine_online_status: Node vmrh75b is online
>>> May 25 13:37:03 vmrh75b sbd[2170]: pcmk: info: unpack_node_loop: Node 2 is already processed
>>> May 25 13:37:03 vmrh75b sbd[2170]: pcmk: info: unpack_node_loop: Node 2 is already processed
>>> May 25 13:37:04 vmrh75b sbd[2171]: cluster: warning: notify_parent: Notifying parent: UNHEALTHY (6)
>>> May 25 13:37:05 vmrh75b sbd[2169]: warning: inquisitor_child: Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0)
>>> May 25 13:37:05 vmrh75b sbd[2171]: cluster: warning: notify_parent: Notifying parent: UNHEALTHY (6)
>>> May 25 13:37:05 vmrh75b sbd[2169]: warning: inquisitor_child: Latency: No liveness for 4 s exceeds threshold of 3 s (healthy servants: 0)
>>>
>>> Best Regards,
>>> Kazunori INOUE

_______________________________________________
Users mailing list: Users@clusterlabs.org
https://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org