On Tue, 2023-03-28 at 13:11 +0800, d tbsky wrote:
> Ken Gaillot <kgail...@redhat.com>
> > I'm glad it's resolved, but for future reference, that does indicate a
> > serious problem. It means the fencer is not accepting any requests, so
> > any fencing attempts or even attempts to monitor a fencing device from
> > that node will fail.
> >
>
> That sounds like pacemaker-fenced became some kind of zombie.
> For testing, I block the connection between the node and ipmi-fencing
> device. the fencing resource stopped and report error like below:
>
> Failed Resource Actions:
>   * fence_ipmi start on c1.example.tw could not be executed (Timed
> Out) because 'Fence agent did not complete in time' at Tue Mar 28
> 12:49:58 2023 after 20.004s
>
> and it recovered when the connection recovered.
> Does it mean fencing is still working?
> I want to make sure if I saw message like "pacemaker-fenced[2405] is
> unresponsive to ipc after 1 tries", does it mean permanent fail or the
> second try success so it no more complains.
>
If successful client connections are shown later in the log, it's
recovered and should not be a problem. Of course, if fencing failed or
timed out, the cluster will want to keep trying before recovering
resources.
-- 
Ken Gaillot <kgail...@redhat.com>

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
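
The log check described in the reply could be sketched roughly as below. This is only an illustration, not anything Pacemaker ships: the function name fencer_recovered is hypothetical, and the thread does not say what a "successful client connection" looks like in the log, so that pattern is left as a parameter the caller must supply. The only message text taken from the thread is the "is unresponsive to ipc" line.

```python
import re
from typing import Iterable

# Message text as quoted in the thread; the PID will vary.
UNRESPONSIVE = re.compile(r"pacemaker-fenced\[\d+\] is unresponsive to ipc")


def fencer_recovered(lines: Iterable[str], recovery_pattern: str) -> bool:
    """Return True if every 'unresponsive' message is followed by a later
    line matching recovery_pattern (or if no such message appears at all).

    recovery_pattern is an assumption supplied by the caller: it should
    match whatever the local Pacemaker version logs when the fencer
    accepts a client connection.
    """
    recovery = re.compile(recovery_pattern)
    recovered = True
    for line in lines:
        if UNRESPONSIVE.search(line):
            # A new complaint resets the state until recovery is seen again.
            recovered = False
        elif recovery.search(line):
            recovered = True
    return recovered
```

To use it, pass the lines of the Pacemaker log for the affected node together with a pattern chosen after inspecting what that node actually logs. A False result only means no matching line was found after the last complaint, so the fencer may still need a closer look.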