How is it possible that pacemaker reports that it takes 4.2 minutes (254930ms) to start the httpd systemd unit?

Feb 19 17:04:09 boss1 pacemaker-execd [1514] (log_execute) info: executing - rsc:apache action:start call_id:25
Feb 19 17:04:09 boss1 pacemaker-execd [1514] (systemd_unit_exec) debug: Performing asynchronous start op on systemd unit httpd named 'apache'
Feb 19 17:04:09 boss1 pacemaker-execd [1514] (systemd_unit_exec_with_unit) debug: Calling StartUnit for apache: /org/freedesktop/systemd1/unit/httpd_2eservice
Feb 19 17:04:10 boss1 pacemaker-execd [1514] (action_complete) notice: Giving up on apache start (rc=0): timeout (elapsed=254930ms, remaining=-154930ms)
Feb 19 17:04:10 boss1 pacemaker-execd [1514] (log_finished) debug: finished - rsc:apache action:monitor call_id:25 exit-code:198 exec-time:254935ms queue-time:235ms

Starting manually works fine and fast:

# time systemctl start httpd

real    0m0.144s
user    0m0.005s
sys     0m0.008s

On 17/02/2020 22:47, Mvrk wrote:
> The pacemaker.log is attached. In the log I can see that the cluster
> tries to start, the start fails, then it tries to stop, and the stop
> also fails.
>
> One more thing, my cluster was working fine on Fedora 28, I started
> having this problem after upgrading to Fedora 31.
>
> On 17/02/2020 21:30, Ricardo Esteves wrote:
>> Hi,
>>
>> Yes, I also don't understand why it is trying to stop them first.
>>
>> SELinux is disabled:
>>
>> # getenforce
>> Disabled
>>
>> All systemd services controlled by the cluster are disabled from
>> starting at boot:
>>
>> # systemctl is-enabled httpd
>> disabled
>>
>> # systemctl is-enabled openvpn-server@01-server
>> disabled
>>
>>
>> On 17/02/2020 20:28, Ken Gaillot wrote:
>>> On Mon, 2020-02-17 at 17:35 +0000, Maverick wrote:
>>>> Hi,
>>>>
>>>> When I start my cluster, most of my systemd resources won't start:
>>>>
>>>> Failed Resource Actions:
>>>> * apache_stop_0 on boss1 'OCF_TIMEOUT' (198): call=82,
>>>>   status='Timed Out', exitreason='', last-rc-change='1970-01-01
>>>>   01:00:54 +01:00', queued=29ms, exec=197799ms
>>>> * openvpn_stop_0 on boss1 'OCF_TIMEOUT' (198): call=61,
>>>>   status='Timed Out', exitreason='', last-rc-change='1970-01-01
>>>>   01:00:54 +01:00', queued=1805ms, exec=198841ms
>>> These show that attempts to stop failed, rather than start.
>>>
>>>> So every time I reboot my node, I need to start the resources manually
>>>> using systemd, for example:
>>>>
>>>> systemd start apache
>>>>
>>>> and then pcs resource cleanup
>>>>
>>>> Resources configuration:
>>>>
>>>> Clone: apache-clone
>>>>  Meta Attrs: maintenance=false
>>>>  Resource: apache (class=systemd type=httpd)
>>>>   Meta Attrs: maintenance=false
>>>>   Operations: monitor interval=60 timeout=100 (apache-monitor-interval-60)
>>>>               start interval=0s timeout=100 (apache-start-interval-0s)
>>>>               stop interval=0s timeout=100 (apache-stop-interval-0s)
>>>>
>>>>
>>>>
>>>> Resource: openvpn (class=systemd type=openvpn-server@01-server)
>>>>  Meta Attrs: maintenance=false
>>>>  Operations: monitor interval=60 timeout=100 (openvpn-monitor-interval-60)
>>>>              start interval=0s timeout=100 (openvpn-start-interval-0s)
>>>>              stop interval=0s timeout=100 (openvpn-stop-interval-0s)
>>>>
>>>>
>>>>
>>>> Btw, if I try a debug-start / debug-stop the mentioned resources
>>>> start and stop ok.
>>> Based on that, my first guess would be SELinux.
>>> Check the SELinux logs for denials.
>>>
>>> Also, make sure your systemd services are not enabled in systemd itself
>>> (e.g. via systemctl enable). Clustered systemd services should be
>>> managed by the cluster only.
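
For anyone who wants to sanity-check the numbers in the log excerpt at the top of this message, here is a minimal Python sketch. It only re-does the arithmetic on two of the quoted log lines. The helper names (`parse_timestamp`, `check`) are made up for illustration, the year 2020 is taken from the dates in this thread (the log stamps carry no year), and the idea that `remaining` is simply the configured timeout minus `elapsed` is an assumption on my part, not something confirmed from the pacemaker source.

```python
import re
from datetime import datetime

# Two log lines copied verbatim from the excerpt in this thread.
LOG = """\
Feb 19 17:04:09 boss1 pacemaker-execd [1514] (log_execute) info: executing - rsc:apache action:start call_id:25
Feb 19 17:04:10 boss1 pacemaker-execd [1514] (action_complete) notice: Giving up on apache start (rc=0): timeout (elapsed=254930ms, remaining=-154930ms)
"""


def parse_timestamp(line):
    # The syslog-style stamp has no year; 2020 is assumed from the thread dates.
    return datetime.strptime("2020 " + line[:15], "%Y %b %d %H:%M:%S")


def check(log):
    lines = log.strip().splitlines()
    start = next(l for l in lines if "executing - " in l)
    giveup = next(l for l in lines if "Giving up on" in l)

    m = re.search(r"elapsed=(-?\d+)ms, remaining=(-?\d+)ms", giveup)
    elapsed_ms, remaining_ms = int(m.group(1)), int(m.group(2))

    # Wall-clock time between the two log stamps (1-second resolution).
    delta = parse_timestamp(giveup) - parse_timestamp(start)
    wall_ms = int(delta.total_seconds() * 1000)

    return {
        # Assumption: remaining = timeout - elapsed, so the sum gives the timeout.
        "implied_timeout_ms": elapsed_ms + remaining_ms,
        "reported_elapsed_ms": elapsed_ms,
        "wall_clock_ms": wall_ms,
    }


if __name__ == "__main__":
    print(check(LOG))
```

With these two lines, elapsed plus remaining comes to 100000 ms, which matches the configured timeout=100 if that is read as seconds, while the log stamps themselves are only about one second apart. So the 254930 ms figure does not look like time httpd actually spent starting. Together with the last-rc-change='1970-01-01 ...' values in the failed actions, that might hint at a time-accounting problem rather than a slow service, but that is only a guess.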