On Sun, 1 Apr 2018 09:01:15 +0300 Andrei Borzenkov <arvidj...@gmail.com> wrote:
> 31.03.2018 23:29, Jehan-Guillaume de Rorthais wrote:
> > Hi all,
> >
> > I experienced a problem in a two node cluster. It has one FA per node and
> > location constraints to avoid the node each of them is supposed to
> > interrupt.
>
> If you mean stonith resource - as far as I know, location does not
> affect stonith operations and only changes where monitoring action is
> performed.

Sure.

> You can create two stonith resources and declare that each
> can fence only single node, but that is not location constraint, it is
> resource configuration. Showing your configuration would be helpful to
> avoid guessing.

True, I should have done that. A conf worth thousands of words :)

  crm conf<<EOC

  primitive fence_vm_srv1 stonith:fence_virsh                   \
    params pcmk_host_check="static-list" pcmk_host_list="srv1"  \
           ipaddr="192.168.2.1" login="<user>"                  \
           identity_file="/root/.ssh/id_rsa"                    \
           port="srv1-d8" action="off"                          \
    op monitor interval=10s

  location fence_vm_srv1-avoids-srv1 fence_vm_srv1 -inf: srv1

  primitive fence_vm_srv2 stonith:fence_virsh                   \
    params pcmk_host_check="static-list" pcmk_host_list="srv2"  \
           ipaddr="192.168.2.1" login="<user>"                  \
           identity_file="/root/.ssh/id_rsa"                    \
           port="srv2-d8" action="off"                          \
    op monitor interval=10s

  location fence_vm_srv2-avoids-srv2 fence_vm_srv2 -inf: srv2

  EOC

> > During some tests, a ms resource raised an error during the stop action on
> > both nodes. So both nodes were supposed to be fenced.
>
> In two-node cluster you can set pcmk_delay_max so that both nodes do not
> attempt fencing simultaneously.

I'm not sure I understand the doc correctly with regard to this property. Does
pcmk_delay_max delay the request itself or the execution of the request?

In other words, is it:

  delay -> fence query -> fencing action

or

  fence query -> delay -> fence action

?

The first definition would solve this issue, but not the second. As I
understand it, as soon as the fence query has been sent, the node status is
"UNCLEAN (online)".
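
To make the setting under discussion concrete, here is a minimal sketch of how
pcmk_delay_max might look on the fence_vm_srv1 primitive from the
configuration above. The 10s value is an arbitrary assumption, and the same
parameter would presumably go on fence_vm_srv2 as well. This only illustrates
the syntax; it does not settle at which point of the sequence the delay is
applied.

```shell
# Sketch only: the fence_vm_srv1 definition with a random fencing delay added.
# pcmk_delay_max="10s" is an assumed value, not taken from the original setup.
crm configure <<EOC
primitive fence_vm_srv1 stonith:fence_virsh \
  params pcmk_host_check="static-list" pcmk_host_list="srv1" \
         ipaddr="192.168.2.1" login="<user>" \
         identity_file="/root/.ssh/id_rsa" \
         port="srv1-d8" action="off" \
         pcmk_delay_max="10s" \
  op monitor interval=10s
EOC
```
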
> > The first node did, but no FA was then able to fence the second one. So the
> > node stayed DC and was reported as "UNCLEAN (online)".
> >
> > We were able to fix the original resource problem, but not to avoid the
> > useless second node fencing.
> >
> > My questions are:
> >
> > 1. is it possible to cancel the fencing request
> > 2. is it possible to reset the node status to "online" ?
>
> Not that I'm aware of.

Argh!

++