So why not use some other fencing method, like disabling the port on the switch, so that nobody can access the faulty node and write data to it? It is common practice too.
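For illustration, a rough sketch of how such switch-port fencing could be configured with the fence_ifmib agent (the switch address 10.0.0.5, the SNMP community and the interface names below are placeholders, and parameter names can differ between fence-agents versions, so check fence_ifmib -h):

    # one stonith resource per node, each mapped to that node's switch port
    pcs stonith create fence-port-node1 fence_ifmib \
        ipaddr=10.0.0.5 community=private port=Gi1/0/10 \
        pcmk_host_list=node1
    pcs stonith create fence-port-node2 fence_ifmib \
        ipaddr=10.0.0.5 community=private port=Gi1/0/11 \
        pcmk_host_list=node2

Note that disabling the data port only cuts the node off from the network and shared storage; the node itself keeps running, so whether that is sufficient depends on what it could still corrupt locally.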
Best regards,
Kristián Feldsam
Tel.: +420 773 303 353, +421 944 137 535
E-mail: supp...@feldhost.cz
www.feldhost.cz - FeldHost™ – professional hosting and server services at fair prices.

FELDSAM s.r.o.
V rohu 434/3
Praha 4 – Libuš, PSČ 142 00
IČ: 290 60 958, DIČ: CZ290 60 958
C 200350, registered at the Municipal Court in Prague

Bank: Fio banka a.s.
Account number: 2400330446/2010
BIC: FIOBCZPPXX
IBAN: CZ82 2010 0000 0024 0033 0446

> On 24 Jul 2017, at 21:16, Klaus Wenninger <kwenn...@redhat.com> wrote:
>
> On 07/24/2017 08:27 PM, Prasad, Shashank wrote:
>> My understanding is that SBD will need shared storage between the clustered
>> nodes, and that SBD will need at least 3 nodes in a cluster if used without
>> shared storage.
>
> Haven't tried it, to be honest, but the reason for 3 nodes is that without a
> shared disk you need a real quorum source and not something 'faked' as with
> the 2-node feature in corosync.
> But I don't see anything speaking against getting proper quorum via qdevice
> instead of a third full cluster node.
>
>> Therefore, for systems which do NOT use shared storage between 1+1 HA
>> clustered nodes, SBD may NOT be an option.
>> Correct me if I am wrong.
>>
>> For cluster systems using the likes of iDRAC/IMM2 fencing agents, which have
>> redundant but shared power supply units with the nodes, the normal fencing
>> mechanisms should work for all resiliency scenarios, except when IMM2/iDRAC
>> is NOT reachable for whatever reason. And to bail out of those situations in
>> the absence of SBD, I believe using user-defined failover hooks (via scripts)
>> into Pacemaker Alerts, with sudo permissions for 'hacluster', should help.
>
> If you don't see your fencing device, assuming after some time that the
> corresponding node is probably down is quite risky in my opinion.
> But why not assure it is down using a watchdog?
>
>> Thanx.
>>
>> From: Klaus Wenninger [mailto:kwenn...@redhat.com]
>> Sent: Monday, July 24, 2017 11:31 PM
>> To: Cluster Labs - All topics related to open-source clustering welcomed;
>> Prasad, Shashank
>> Subject: Re: [ClusterLabs] Two nodes cluster issue
>>
>> On 07/24/2017 07:32 PM, Prasad, Shashank wrote:
>> Sometimes IPMI fence devices use shared power of the node, and it cannot be
>> avoided. In such scenarios the HA cluster is NOT able to handle the power
>> failure of a node, since the power is shared with its own fence device.
>> IPMI-based fencing can also fail for other reasons.
>>
>> A failure to fence the failed node will cause the cluster to be marked
>> UNCLEAN. To get over it, the following command needs to be invoked on the
>> surviving node:
>>
>> pcs stonith confirm <failed_node_name> --force
>>
>> This can be automated by hooking a recovery script onto the Stonith resource
>> 'Timed Out' event. To be more specific, Pacemaker Alerts can be used to watch
>> for Stonith timeouts and failures.
>> In that script, all that essentially needs to be executed is the
>> aforementioned command.
>>
>> If I get you right here, you could just as well disable fencing in the first
>> place. Actually, quorum-based watchdog fencing is the way to do this in a
>> safe manner. This of course assumes you have a proper source for quorum in
>> your 2-node setup, e.g. qdevice, or a shared disk with sbd (not directly
>> pacemaker quorum here, but a similar thing handled inside sbd).
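As an illustration, a rough sketch of what such a qdevice plus watchdog-only sbd setup could look like on a two-node cluster (the host name qnetd-host and the timeout value are placeholders; the exact commands depend on the pcs and sbd versions in use):

    # on a third machine outside the cluster (placeholder name qnetd-host),
    # with the corosync-qnetd and pcs packages installed:
    pcs qdevice setup model net --enable --start

    # on the cluster nodes: add the quorum device, enable watchdog-only sbd,
    # and tell pacemaker to rely on the watchdog for fencing
    pcs quorum device add model net host=qnetd-host algorithm=ffsplit
    pcs stonith sbd enable
    pcs property set stonith-watchdog-timeout=10s

With something like this, a node that loses quorum is reset by its hardware watchdog within the configured timeout, so the surviving node can take over even when IPMI or a switched PDU is unreachable.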
>> Since the alerts are issued from the 'hacluster' login, sudo permissions for
>> 'hacluster' need to be configured.
>>
>> Thanx.
>>
>> From: Klaus Wenninger [mailto:kwenn...@redhat.com]
>> Sent: Monday, July 24, 2017 9:24 PM
>> To: Kristián Feldsam; Cluster Labs - All topics related to open-source
>> clustering welcomed
>> Subject: Re: [ClusterLabs] Two nodes cluster issue
>>
>> On 07/24/2017 05:37 PM, Kristián Feldsam wrote:
>> I personally think that powering off the node via a switched PDU is safer,
>> or not?
>>
>> True if that is working in your environment. If you can't do a physical
>> setup where you aren't simultaneously losing connection to both your node
>> and the switch device (or you just want to cover cases where that happens),
>> you have to come up with something else.
>>
>> On 24 Jul 2017, at 17:27, Klaus Wenninger <kwenn...@redhat.com> wrote:
>>
>> On 07/24/2017 05:15 PM, Tomer Azran wrote:
>> I still don't understand why the qdevice concept doesn't help in this
>> situation. Since the master node is down, I would expect the quorum to
>> declare it as dead.
>> Why doesn't that happen?
>>
>> That is not how quorum works. It just limits the decision-making to the
>> quorate subset of the cluster.
>> Still, the unknown nodes are not sure to be down.
>> That is why I suggested having quorum-based watchdog fencing with sbd.
>> That would assure that within a certain time all nodes of the non-quorate
>> part of the cluster are down.
>>
>> On Mon, Jul 24, 2017 at 4:15 PM +0300, "Dmitri Maziuk"
>> <dmitri.maz...@gmail.com> wrote:
>>
>> On 2017-07-24 07:51, Tomer Azran wrote:
>> > We don't have the ability to use it.
>> > Is that the only solution?
>>
>> No, but I'd recommend thinking about it first. Are you sure you will
>> care about your cluster working when your server room is on fire? 'Cause
>> unless you have halon suppression, your server room is a complete
>> write-off anyway. (Think water from sprinklers hitting rich chunky volts
>> in the servers.)
>> Dima
>>
>> --
>> Klaus Wenninger
>>
>> Senior Software Engineer, EMEA ENG Openstack Infrastructure
>>
>> Red Hat
>>
>> kwenn...@redhat.com
_______________________________________________
Users mailing list: Users@clusterlabs.org
http://lists.clusterlabs.org/mailman/listinfo/users

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org