So why not use some other fencing method, like disabling the port on the switch, so that nobody can access the faulty node and write data to it? It is common practice too.
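For illustration, a rough sketch of how such switch-port fencing could be configured with the fence_ifmib agent (the switch address 10.0.0.5, the SNMP community and the interface names below are placeholders, and parameter names can differ between fence-agents versions, so check fence_ifmib -h):

    # one stonith resource per node, each mapped to that node's switch port
    pcs stonith create fence-port-node1 fence_ifmib \
        ipaddr=10.0.0.5 community=private port=Gi1/0/10 \
        pcmk_host_list=node1
    pcs stonith create fence-port-node2 fence_ifmib \
        ipaddr=10.0.0.5 community=private port=Gi1/0/11 \
        pcmk_host_list=node2

Note that disabling the data port only cuts the node off from the network and shared storage; the node itself keeps running, so whether that is sufficient depends on what it could still corrupt locally.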
Best regards,
Kristián Feldsam
Tel.: +420 773 303 353, +421 944 137 535
E-mail: supp...@feldhost.cz
www.feldhost.cz - FeldHost™ – professional hosting and server services at fair prices.

FELDSAM s.r.o.
V rohu 434/3
Praha 4 – Libuš, PSČ 142 00
IČ: 290 60 958, DIČ: CZ290 60 958
C 200350, registered at the Municipal Court in Prague

Bank: Fio banka a.s.
Account number: 2400330446/2010
BIC: FIOBCZPPXX
IBAN: CZ82 2010 0000 0024 0033 0446

> On 24 Jul 2017, at 21:16, Klaus Wenninger <kwenn...@redhat.com> wrote:
>
> On 07/24/2017 08:27 PM, Prasad, Shashank wrote:
>> My understanding is that SBD will need shared storage between the clustered
>> nodes, and that SBD will need at least 3 nodes in a cluster if used without
>> shared storage.
>
> Haven't tried it, to be honest, but the reason for 3 nodes is that without a
> shared disk you need a real quorum source and not something 'faked' as with
> the 2-node feature in corosync.
> But I don't see anything speaking against getting proper quorum via qdevice
> instead of a third full cluster node.
>
>> Therefore, for systems which do NOT use shared storage between 1+1 HA
>> clustered nodes, SBD may NOT be an option.
>> Correct me if I am wrong.
>>
>> For cluster systems using the likes of iDRAC/IMM2 fencing agents, which have
>> redundant but shared power supply units with the nodes, the normal fencing
>> mechanisms should work for all resiliency scenarios, except when IMM2/iDRAC
>> is NOT reachable for whatever reason. And to bail out of those situations in
>> the absence of SBD, I believe using user-defined failover hooks (via scripts)
>> into Pacemaker Alerts, with sudo permissions for 'hacluster', should help.
>
> If you don't see your fencing device, assuming after some time that the
> corresponding node is probably down is quite risky in my opinion.
> But why not assure it is down using a watchdog?
>
>> Thanx.
>>
>> From: Klaus Wenninger [mailto:kwenn...@redhat.com]
>> Sent: Monday, July 24, 2017 11:31 PM
>> To: Cluster Labs - All topics related to open-source clustering welcomed;
>> Prasad, Shashank
>> Subject: Re: [ClusterLabs] Two nodes cluster issue
>>
>> On 07/24/2017 07:32 PM, Prasad, Shashank wrote:
>> Sometimes IPMI fence devices use shared power of the node, and it cannot be
>> avoided. In such scenarios the HA cluster is NOT able to handle the power
>> failure of a node, since the power is shared with its own fence device.
>> IPMI-based fencing can also fail for other reasons.
>>
>> A failure to fence the failed node will cause the cluster to be marked
>> UNCLEAN. To get over it, the following command needs to be invoked on the
>> surviving node:
>>
>> pcs stonith confirm <failed_node_name> --force
>>
>> This can be automated by hooking a recovery script onto the Stonith resource
>> 'Timed Out' event. To be more specific, Pacemaker Alerts can be used to watch
>> for Stonith timeouts and failures.
>> In that script, all that essentially needs to be executed is the
>> aforementioned command.
>>
>> If I get you right here, you could just as well disable fencing in the first
>> place. Actually, quorum-based watchdog fencing is the way to do this in a
>> safe manner. This of course assumes you have a proper source for quorum in
>> your 2-node setup, e.g. qdevice, or a shared disk with sbd (not directly
>> pacemaker quorum here, but a similar thing handled inside sbd).
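As an illustration, a rough sketch of what such a qdevice plus watchdog-only sbd setup could look like on a two-node cluster (the host name qnetd-host and the timeout value are placeholders; the exact commands depend on the pcs and sbd versions in use):

    # on a third machine outside the cluster (placeholder name qnetd-host),
    # with the corosync-qnetd and pcs packages installed:
    pcs qdevice setup model net --enable --start

    # on the cluster nodes: add the quorum device, enable watchdog-only sbd,
    # and tell pacemaker to rely on the watchdog for fencing
    pcs quorum device add model net host=qnetd-host algorithm=ffsplit
    pcs stonith sbd enable
    pcs property set stonith-watchdog-timeout=10s

With something like this, a node that loses quorum is reset by its hardware watchdog within the configured timeout, so the surviving node can take over even when IPMI or a switched PDU is unreachable.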
>> Since the alerts are issued from the 'hacluster' login, sudo permissions for
>> 'hacluster' need to be configured.
>>
>> Thanx.
>>
>> From: Klaus Wenninger [mailto:kwenn...@redhat.com]
>> Sent: Monday, July 24, 2017 9:24 PM
>> To: Kristián Feldsam; Cluster Labs - All topics related to open-source
>> clustering welcomed
>> Subject: Re: [ClusterLabs] Two nodes cluster issue
>>
>> On 07/24/2017 05:37 PM, Kristián Feldsam wrote:
>> I personally think that powering off the node via a switched PDU is safer,
>> or not?
>>
>> True if that is working in your environment. If you can't do a physical
>> setup where you aren't simultaneously losing connection to both your node
>> and the switch device (or you just want to cover cases where that happens),
>> you have to come up with something else.
>>
>> On 24 Jul 2017, at 17:27, Klaus Wenninger <kwenn...@redhat.com> wrote:
>>
>> On 07/24/2017 05:15 PM, Tomer Azran wrote:
>> I still don't understand why the qdevice concept doesn't help in this
>> situation. Since the master node is down, I would expect the quorum to
>> declare it as dead.
>> Why doesn't that happen?
>>
>> That is not how quorum works. It just limits the decision-making to the
>> quorate subset of the cluster.
>> Still, the unknown nodes are not sure to be down.
>> That is why I suggested having quorum-based watchdog fencing with sbd.
>> That would assure that within a certain time all nodes of the non-quorate
>> part of the cluster are down.
>>
>> On Mon, Jul 24, 2017 at 4:15 PM +0300, "Dmitri Maziuk"
>> <dmitri.maz...@gmail.com> wrote:
>>
>> On 2017-07-24 07:51, Tomer Azran wrote:
>> > We don't have the ability to use it.
>> > Is that the only solution?
>>
>> No, but I'd recommend thinking about it first. Are you sure you will
>> care about your cluster working when your server room is on fire? 'Cause
>> unless you have halon suppression, your server room is a complete
>> write-off anyway. (Think water from sprinklers hitting rich chunky volts
>> in the servers.)
>> Dima
>>
>> --
>> Klaus Wenninger
>>
>> Senior Software Engineer, EMEA ENG Openstack Infrastructure
>>
>> Red Hat
>>
>> kwenn...@redhat.com
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org