My understanding is that SBD needs shared storage between the clustered 
nodes, and that SBD needs at least 3 nodes in a cluster if used without 
shared storage.

Therefore, for systems which do NOT use shared storage between 1+1 HA clustered 
nodes, SBD may NOT be an option.

Correct me if I am wrong.

 

For cluster systems using the likes of iDRAC/IMM2 fencing agents, which have 
power supply units that are redundant but shared with the nodes, the normal 
fencing mechanisms should work for all resiliency scenarios except those where 
the IMM2/iDRAC itself is NOT reachable for whatever reason. To bail out of 
those situations in the absence of SBD, I believe user-defined failover hooks 
(via scripts) plugged into Pacemaker Alerts, with sudo permissions for 
‘hacluster’, should help; a sketch of the hookup follows.
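
As an illustration only (my own sketch, not a tested recipe), the sudo grant 
and the alert registration might look like this; the script path and alert id 
are hypothetical, and the script body itself is sketched further down in the 
quoted thread:

# Allow 'hacluster' (the user alert agents run as) to run pcs via sudo
# without a password; install with: visudo -f /etc/sudoers.d/hacluster
hacluster ALL=(root) NOPASSWD: /usr/sbin/pcs

# Register the hook script as a Pacemaker alert agent
pcs alert create path=/var/lib/pacemaker/alert_stonith_confirm.sh \
    id=stonith_confirm_hook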

 

Thanx.

 

 

From: Klaus Wenninger [mailto:kwenn...@redhat.com] 
Sent: Monday, July 24, 2017 11:31 PM
To: Cluster Labs - All topics related to open-source clustering welcomed; 
Prasad, Shashank
Subject: Re: [ClusterLabs] Two nodes cluster issue

 

On 07/24/2017 07:32 PM, Prasad, Shashank wrote:

        Sometimes IPMI fence devices use power shared with the node itself, and 
that cannot be avoided.

        In such scenarios the HA cluster is NOT able to handle a power failure 
of that node, since the power is shared with its own fence device.

        IPMI-based fencing can also fail for other reasons.

         

        A failure to fence the failed node will cause that node to be marked 
UNCLEAN.

        To get over it, the following command needs to be invoked on the 
surviving node.

         

        pcs stonith confirm <failed_node_name> --force

         

        This can be automated by hooking a recovery script into the Stonith 
resource’s ‘Timed Out’ event.

        To be more specific, Pacemaker Alerts can be used to watch for 
Stonith timeouts and failures.

        In that script, essentially all that needs to be executed is the 
aforementioned command; a sketch of such an agent follows.
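
        For illustration only (not part of the original post), an alert agent 
        along these lines could watch for failed fencing actions and issue the 
        confirm. The script path is hypothetical, and note the caveat: 
        confirming a node that is in fact still alive risks data corruption.

        #!/bin/sh
        # Hypothetical alert agent: /var/lib/pacemaker/alert_stonith_confirm.sh
        # Pacemaker exports CRM_alert_* variables to alert agents; for fencing
        # events CRM_alert_kind is "fencing", CRM_alert_node is the target node
        # and CRM_alert_rc is the fencing operation's return code.
        if [ "${CRM_alert_kind}" = "fencing" ] && [ "${CRM_alert_rc}" != "0" ]; then
            # Manually confirm the node as down so the cluster can recover.
            # Only safe if the node really is powered off!
            sudo pcs stonith confirm "${CRM_alert_node}" --force
        fi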


If I get you right here, you could just as well disable fencing in the first place.
Actually quorum-based-watchdog-fencing is the way to do this in a
safe manner. This of course assumes you have a proper source of
quorum in your 2-node setup, e.g. via qdevice or using a shared
disk with sbd (not directly pacemaker quorum here but a similar thing
handled inside sbd).
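
For reference (my addition, not from the original mail): a two-node cluster 
can get such a quorum source from corosync-qdevice. A minimal sketch, assuming 
a qnetd daemon is already running on a third host named qnetd-host:

# On both cluster nodes: add the quorum device pointing at the arbitrator.
# The ffsplit algorithm grants the vote to exactly one half on a split.
pcs quorum device add model net host=qnetd-host algorithm=ffsplit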




Since the alerts are issued from the ‘hacluster’ login, sudo permissions for 
‘hacluster’ need to be configured.

 

Thanx.

 

 

From: Klaus Wenninger [mailto:kwenn...@redhat.com] 
Sent: Monday, July 24, 2017 9:24 PM
To: Kristián Feldsam; Cluster Labs - All topics related to open-source 
clustering welcomed
Subject: Re: [ClusterLabs] Two nodes cluster issue

 

On 07/24/2017 05:37 PM, Kristián Feldsam wrote:

        I personally think that powering off the node via a switched PDU is 
safer, or not?


True, if that is working in your environment. If you can't do a physical setup
where you aren't simultaneously losing the connection to both your node and
the switch device (or you just want to cover cases where that happens),
you have to come up with something else.







 

        On 24 Jul 2017, at 17:27, Klaus Wenninger <kwenn...@redhat.com> wrote:

         

        On 07/24/2017 05:15 PM, Tomer Azran wrote:

                I still don't understand why the qdevice concept doesn't help 
in this situation. Since the master node is down, I would expect the quorum to 
declare it as dead.

                Why doesn't that happen?

        
        That is not how quorum works. It just limits the decision-making to the 
quorate subset of the cluster.
        The unseen nodes still cannot be assumed to be down.
        That is why I suggested having quorum-based watchdog-fencing with sbd.
        That would assure that, within a certain time, all nodes of the 
non-quorate part
        of the cluster are down.
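
        For illustration (my addition; exact commands vary by pcs version and 
        distribution): diskless watchdog fencing with sbd might be enabled 
        roughly like this, given a working /dev/watchdog on every node:

        # On each node: enable sbd in watchdog-only mode (no shared disk).
        # A node that loses quorum then self-fences via its hardware watchdog.
        pcs stonith sbd enable

        # Tell Pacemaker it may assume a lost, non-quorate node is dead once
        # the watchdog timeout (example value) has expired.
        pcs property set stonith-watchdog-timeout=10s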
        
        
        
        

        
        
        
        

        On Mon, Jul 24, 2017 at 4:15 PM +0300, "Dmitri Maziuk" 
<dmitri.maz...@gmail.com> wrote:

        On 2017-07-24 07:51, Tomer Azran wrote:
        > We don't have the ability to use it.
        > Is that the only solution?
         
        No, but I'd recommend thinking about it first. Are you sure you will 
        care about your cluster working when your server room is on fire? 
        'Cause unless you have halon suppression, your server room is a 
        complete write-off anyway. (Think water from sprinklers hitting rich 
        chunky volts in the servers.)
         
        Dima
         

        
        
        
        
        


         



 






  
