>>> Antony Stone <antony.st...@ha.open.source.it> wrote on 21.12.2022 at 17:19 in
message <202212211719.34369.antony.st...@ha.open.source.it>:
> On Wednesday 21 December 2022 at 16:59:16, Antony Stone wrote:
>
>> Hi.
>>
>> I'm implementing fencing on a 7-node cluster as described recently:
>> https://lists.clusterlabs.org/pipermail/users/2022-December/030714.html
>>
>> I'm using external/ssh for the time being, and it works if I test it using:
>>
>> stonith -t external/ssh -p "nodeA nodeB nodeC" -T reset nodeB
>>
>>
>> However, when it's supposed to be invoked because a node has got stuck, I
>> simply find syslog full of the following (one from each of the other six
>> nodes in the cluster):
>>
>> pacemaker-fenced[3262]: notice: Operation reboot of nodeB by <no-one> for
>> pacemaker-controld.26852@nodeA.93b391b2: No such device
>>
>> I have defined seven stonith resources, one for rebooting each machine, and
>> I can see from "crm status" that they have been assigned randomly amongst
>> the other servers, usually one per server, so that looks good.
>>
>>
>> The main things that puzzle me about the log message are:
>>
>> a) why does it say "<no-one>"? Is this more like "anyone", meaning that
>> no-one in particular is required to do this task, provided that at least
>> someone does it? Does this indicate a configuration problem?
>
> PS: I've just noticed that I'm also getting log entries immediately
> afterwards:
>
> pacemaker-controld[3264]: notice: Peer nodeB was not terminated (reboot) by
> <anyone> on behalf of pacemaker-controld.26852: No such device
>
>> b) what is this "device" referred to? I'm using "external/ssh" so there is
>> no actual Stonith device for power-cycling hardware machines - am I
>> supposed to define some sort of dummy device somewhere?
>>
>> For clarity, this is what I have added to my cluster configuration to set
>> this up:
>>
>> primitive reboot_nodeA stonith:external/ssh params hostlist="nodeA"
>> location only_nodeA reboot_nodeA -inf: nodeA

"location only_nodeA" meaning "location not_nodeA"? ;-)

>>
>> ...repeated for all seven nodes.
>>
>> I also have "stonith-enabled=yes" in the cib-bootstrap-options.
>>
>>
>> Ideas, anyone?
>>
>> Thanks,
>>
>>
>> Antony.
>
> --
> Normal people think "If it ain't broke, don't fix it".
> Engineers think "If it ain't broke, it doesn't have enough features yet".
>
> Please reply to the list;
> please *don't* CC me.
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/