>>> Howard <hmon...@gmail.com> wrote on 19.06.2020 at 00:13 in message
<cao51vj7rnijz60kakmcgqztoubjbwgkakdqdnqsvh32+jdb...@mail.gmail.com>:
> Thanks for all the help so far. With your assistance, I'm very close to
> stable.
>
> Made the following changes to the vmfence stonith resource:
>
>  Meta Attrs: failure-timeout=30m migration-threshold=10
>  Operations: monitor interval=60s (vmfence-monitor-interval-60s)
>
> If I understand this correctly, it will check if the fencing device is
> online every 60 seconds. It will try 10 times and then mark the node
> ineligible. After 30 minutes it will start trying again.

Did you add "meta failure-timeout=30m" to the stonith resource? Maybe you
could also set the stonith timeout to a higher value, the threshold to a
lower value (like 3), and the failure-timeout to a higher value (like
several hours or days).
(The idea is that if you have, say, one failure every second day, you don't
want the resource to be disabled after a week or two because the failure
count has accumulated.)

Of course, while testing you may use lower values for the impatient ;-)

Regards,
Ulrich

>
> On Thu, Jun 18, 2020 at 12:29 PM Ken Gaillot <kgail...@redhat.com> wrote:
>
>> On Thu, 2020-06-18 at 21:32 +0300, Andrei Borzenkov wrote:
>> > 18.06.2020 18:24, Ken Gaillot wrote:
>> > > Note that a failed start of a stonith device will not prevent the
>> > > cluster from using that device for fencing. It just prevents the
>> > > cluster from monitoring the device.
>> > >
>> >
>> > My understanding is that if stonith resource cannot run anywhere, it
>> > also won't be used for stonith. When failcount exceeds threshold,
>> > resource is banned from node. If it happens on all nodes, resource
>> > cannot run anywhere and so won't be used for stonith. Start failure
>> > automatically sets failcount to INFINITY.
>> >
>> > Or do I misunderstand something?
>>
>> I had to test to confirm, but a stonith resource stopped due to
>> failures can indeed be used. Only stonith resources stopped via
>> location constraints (bans) or target-role=Stopped are prevented from
>> being used.
>> --
>> Ken Gaillot <kgail...@redhat.com>
>>
>> _______________________________________________
>> Manage your subscription:
>> https://lists.clusterlabs.org/mailman/listinfo/users
>>
>> ClusterLabs home: https://www.clusterlabs.org/
>>
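
The interaction between migration-threshold and failure-timeout discussed in
this thread can be illustrated with a small model. This is only a sketch
under simplifying assumptions, not Pacemaker's actual implementation: the
function name banned_at and its parameters are hypothetical, times are plain
minutes, the fail count is tracked for a single node, and the whole count is
assumed to expire once failure-timeout has passed since the most recent
failure. It also ignores the point made by Ken Gaillot above that a stonith
device stopped due to failures can still be used for fencing.

```python
from typing import Iterable, Optional

DAY = 24 * 60  # minutes per day


def banned_at(failure_times: Iterable[int],
              migration_threshold: int,
              failure_timeout: Optional[int] = None) -> Optional[int]:
    """Return the time (in minutes) at which the modeled fail count first
    reaches migration_threshold, or None if it never does.

    failure_times must be sorted in ascending order. failure_timeout=None
    models a resource with no failure-timeout set (failures never expire).
    """
    failcount = 0
    last_failure = None
    for t in failure_times:
        # Assumption: all earlier failures are forgotten once the most
        # recent one is at least failure_timeout old.
        if (failure_timeout is not None and last_failure is not None
                and t - last_failure >= failure_timeout):
            failcount = 0
        failcount += 1
        last_failure = t
        if failcount >= migration_threshold:
            return t
    return None


if __name__ == "__main__":
    # One failure every second day for 30 days.
    every_other_day = [2 * DAY * i for i in range(1, 16)]

    cases = [
        ("threshold=10, no failure-timeout", 10, None),
        ("threshold=10, failure-timeout=30m", 10, 30),
        ("threshold=3, failure-timeout=1 day", 3, DAY),
        ("threshold=3, failure-timeout=3 days", 3, 3 * DAY),
    ]
    for label, threshold, timeout in cases:
        when = banned_at(every_other_day, threshold, timeout)
        result = "never" if when is None else f"day {when // DAY}"
        print(f"{label}: node ineligible at {result}")
```

In this model, sporadic failures only stop accumulating when failure-timeout
is shorter than the usual gap between them, while a burst of failures inside
the timeout window still reaches the threshold. That is the trade-off to keep
in mind when picking the values suggested above.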