On 1/23/23 19:05, Reid Wahl wrote:
On Mon, Jan 23, 2023 at 9:59 AM Roberto Ferrari <rferr...@mbigroup.it> wrote:
On 23/01/23 18:25, Reid Wahl wrote:
On Mon, Jan 23, 2023 at 7:51 AM Roberto Ferrari <rferr...@mbigroup.it> wrote:
Hello everybody,
I'd like to understand a strange behavior of one of my clusters, which
basically has some IPaddr2 resources and a systemd resource that manages
netfilter-persistent.
Here is the configuration:
primitive FW-VIP-Outside IPaddr2 \
params ip=192.168.26.74 cidr_netmask=24 nic=outside arp_bg=true \
op monitor interval=20s timeout=20s
primitive FW-VIP-Private IPaddr2 \
params ip=192.168.104.100 cidr_netmask=24 nic=private arp_bg=true \
op monitor interval=20s timeout=20s
primitive Netfilter systemd:netfilter-persistent \
op start interval=0 timeout=60 \
op stop interval=0 timeout=60
group FW-VIPs FW-VIP-Private FW-VIP-Outside Netfilter
When I reboot the active node, it hangs during shutdown for many
minutes, printing:
A stop job is running for Pacemaker High Availability Cluster Manager
(11 s / 30 min). (where 11 is the number of seconds already elapsed)
By contrast, switching to another master is immediate, and running
systemctl stop netfilter-persistent by hand is immediate too.
Do you have any hint on what goes wrong with this? I cannot find
anything strange in the logs.
Thanks a lot,
Roberto.
Is the netfilter systemd unit enabled outside pacemaker? Run
`systemctl is-enabled netfilter-persistent` to find out, and run
`systemctl disable netfilter-persistent` to disable it if it's
enabled. Only Pacemaker should start or stop netfilter.
--
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users
ClusterLabs home: https://www.clusterlabs.org/
--
Regards,
Reid Wahl (He/Him)
Senior Software Engineer, Red Hat
RHEL High Availability - Pacemaker
Thanks a lot, Reid,
Unfortunately that is not the case here: netfilter-persistent seems to
be disabled at boot already.
Cheers,
R.
Can you share the pacemaker logs from the shutdown period? That will
probably give some idea of what it's waiting on.
Here you are:
Jan 23 09:21:33 usab-fe2 pacemaker-controld[1327]: notice: Result of start operation for Netfilter on usab-fe2: 0 (ok)
Jan 23 09:27:45 usab-fe2 pacemakerd[1296]: notice: Caught 'Terminated' signal
Jan 23 09:27:45 usab-fe2 pacemakerd[1296]: notice: Shutting down Pacemaker
Jan 23 09:27:45 usab-fe2 systemd[1]: Stopping Pacemaker High Availability Cluster Manager...
Jan 23 09:27:45 usab-fe2 pacemakerd[1296]: notice: Stopping pacemaker-controld
Jan 23 09:27:45 usab-fe2 pacemaker-controld[1327]: notice: Caught 'Terminated' signal
Jan 23 09:27:45 usab-fe2 pacemaker-controld[1327]: notice: Shutting down cluster resource manager
Jan 23 09:27:45 usab-fe2 pacemaker-attrd[1325]: notice: Setting shutdown[usab-fe2]: (unset) -> 1674466065
Jan 23 09:28:45 usab-fe2 pacemaker-execd[1324]: notice: Giving up on Netfilter stop (rc=0): timeout (elapsed=59991ms, remaining=9ms)
Jan 23 09:28:45 usab-fe2 pacemaker-controld[1327]: error: Result of stop operation for Netfilter on usab-fe2: Timed Out
Jan 23 09:28:45 usab-fe2 pacemaker-attrd[1325]: notice: Setting fail-count-Netfilter#stop_0[usab-fe2]: (unset) -> INFINITY
Jan 23 09:28:45 usab-fe2 pacemaker-attrd[1325]: notice: Setting last-failure-Netfilter#stop_0[usab-fe2]: (unset) -> 1674466125
Jan 23 09:47:45 usab-fe2 pacemaker-controld[1327]: error: Shutdown Escalation just popped in state S_NOT_DC!
Jan 23 09:47:45 usab-fe2 pacemaker-controld[1327]: notice: State transition S_NOT_DC -> S_STOPPING
Jan 23 09:47:45 usab-fe2 pacemaker-controld[1327]: notice: Stopped 0 recurring operations at shutdown... waiting (2 remaining)
Jan 23 09:47:45 usab-fe2 pacemaker-controld[1327]: notice: Recurring action FW-VIP-Private:64 (FW-VIP-Private_monitor_20000) incomplete at shutdown
Jan 23 09:47:45 usab-fe2 pacemaker-controld[1327]: notice: Recurring action FW-VIP-Outside:66 (FW-VIP-Outside_monitor_20000) incomplete at shutdown
Jan 23 09:47:45 usab-fe2 pacemaker-controld[1327]: error: 3 resources were active at shutdown
Jan 23 09:47:45 usab-fe2 pacemaker-controld[1327]: notice: Disconnected from the executor
Jan 23 09:47:45 usab-fe2 pacemaker-controld[1327]: notice: Disconnected from Corosync
Jan 23 09:47:45 usab-fe2 pacemaker-controld[1327]: notice: Disconnected from the CIB manager
Jan 23 09:47:45 usab-fe2 systemd[1]: pacemaker.service: Succeeded.
Jan 23 09:47:45 usab-fe2 systemd[1]: Stopped Pacemaker High Availability Cluster Manager.
-- Reboot --
Jan 23 09:49:34 usab-fe2 systemd[1]: Started Pacemaker High Availability Cluster Manager.
Thanks,
R.
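
As a rough cross-check of the timeline in the log above, the sketch below
redoes the arithmetic with values copied from it: the two epoch timestamps
recorded by pacemaker-attrd, and the wall-clock times of the shutdown request
and of the "Shutdown Escalation" message. The variable and function names are
illustrative only. The 60-second gap matches the `op stop interval=0
timeout=60` on the Netfilter primitive. The 20-minute gap would be consistent
with Pacemaker's `shutdown-escalation` cluster option at its default of 20
minutes, which is an assumption here because the thread does not show that
option.

```shell
#!/bin/sh
# Epoch values copied from the pacemaker-attrd log lines.
shutdown_requested=1674466065   # Setting shutdown[usab-fe2]
stop_failed=1674466125          # Setting last-failure-Netfilter#stop_0[usab-fe2]

# Seconds between the shutdown request and the recorded stop failure.
stop_wait=$((stop_failed - shutdown_requested))
echo "Netfilter stop gave up after ${stop_wait}s"

# Convert an HH:MM:SS wall-clock time to seconds since midnight.
# awk is used so that fields such as "09" are read as decimal, not octal.
to_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Wall-clock times copied from the log (same day, so no date handling needed).
shutdown_at=$(to_seconds "09:27:45")     # Caught 'Terminated' signal
escalation_at=$(to_seconds "09:47:45")   # Shutdown Escalation just popped

escalation_wait=$((escalation_at - shutdown_at))
echo "Escalation fired after ${escalation_wait}s ($((escalation_wait / 60)) min)"
```

Read this way, the reboot was blocked first by the Netfilter stop timing out
and then by the wait for the escalation timer, so the question that remains
is why the stop of netfilter-persistent hangs when Pacemaker issues it during
system shutdown.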