You can set up the system so that, on a fabric fence, the node is rebooted,
which allows it to 'unfence' itself afterwards.
For details, see https://access.redhat.com/solutions/3367151 or
https://access.redhat.com/node/65187 (you can use a Red Hat developer
subscription to access them).

It seems that fence_mpath has had watchdog integration since a certain version,
but you can still use /usr/share/cluster/fence_mpath_check (via the watchdog
service and a supported watchdog device). Even if you don't have a proper
watchdog device, you can use the 'softdog' module: the node is already fenced
at the SAN level, so there is no risk even if the reboot does not happen.
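
A minimal sketch of that watchdog setup, assuming a RHEL-like node with the
"watchdog" package and the fence agents installed, and assuming the watchdog
daemon picks up scripts from /etc/watchdog.d/ (check the Red Hat articles
above for the exact steps on your release):

```shell
# Assumption: run as root on every cluster node; paths may differ per distribution.

# Load the software watchdog now and on every boot
# (only needed when there is no hardware watchdog device).
modprobe softdog
echo softdog > /etc/modules-load.d/softdog.conf

# Let the watchdog daemon run the check script shipped with the fence agent.
ln -s /usr/share/cluster/fence_mpath_check /etc/watchdog.d/

# Start the watchdog daemon now and enable it at boot.
systemctl enable --now watchdog
```

With this in place, the watchdog should reboot the node once the check script
notices that the node's key has been removed from the shared device.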

Best Regards,
Strahil Nikolov

On Sat, Aug 28, 2021 at 10:14, Andrei Borzenkov <arvidj...@gmail.com> wrote:
On Fri, Aug 27, 2021 at 8:11 PM Gerry R Sommerville <ge...@ca.ibm.com> wrote:
>
> Hey all,
>
> From what I see in the documentation for fabric fencing, Pacemaker requires 
> an administrator to log in to the node to manually start and unfence the node 
> after some failure.
>  
>https://clusterlabs.org/pacemaker/doc/deprecated/en-US/Pacemaker/2.0/html/Pacemaker_Explained/s-unfencing.html
>

This is about fabric (or resource) fencing. In this case the node is cut
off from some vital resources but remains up and running, so someone
does indeed need to intervene manually.

> The concern I have is that if there are intermittent network issues, a node may 
> get fenced and we have to wait for someone to log into the cluster and bring 
> the node back online. Meanwhile the network issue may have resolved itself 
> shortly after the node was fenced.
>
> I wonder if there are any configurations or popular solutions that people use 
> to automatically unfence nodes and have them rejoin the cluster?
>

Most people use stonith (or node fencing), where the affected node is
rebooted. As long as pacemaker is configured to start automatically
and network connectivity is restored after the reboot, the node will
rejoin the cluster automatically.
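
For illustration, starting the cluster stack at boot on a systemd-based
node could look like this (a sketch that assumes the standard corosync and
pacemaker unit names):

```shell
# Run as root on each node: start the cluster stack automatically at boot.
systemctl enable corosync pacemaker
```

On clusters managed with pcs, "pcs cluster enable --all" should achieve the
same for all nodes at once.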

I think that with fabric fencing the node is unfenced automatically
when it reboots and attempts to join the cluster (hopefully someone can
chime in here). I am not sure what happens if the node is not rebooted
but connectivity is restored.
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
  