> You really need to debug the start & stop of the resource.
>
> Please try the debug procedure and provide the output:
> https://wiki.clusterlabs.org/wiki/Debugging_Resource_Failures
>
> Best Regards,
> Strahil Nikolov
Hi,

Correct me if I'm wrong, but I think that procedure doesn't work for systemd class resources; I don't know which OCF script is responsible for handling systemd class resources. Also, the crm command doesn't exist on RHEL/Fedora; I've only seen the crm command on SUSE.

On 19/02/2020 19:23, Strahil Nikolov wrote:
> On February 19, 2020 7:21:12 PM GMT+02:00, Maverick <m...@sapo.pt> wrote:
>> How is it possible that pacemaker is reporting that it takes 4.2 minutes
>> (254930ms) to execute the start of the httpd systemd unit?
>>
>> Feb 19 17:04:09 boss1 pacemaker-execd [1514] (log_execute) info: executing - rsc:apache action:start call_id:25
>> Feb 19 17:04:09 boss1 pacemaker-execd [1514] (systemd_unit_exec) debug: Performing asynchronous start op on systemd unit httpd named 'apache'
>> Feb 19 17:04:09 boss1 pacemaker-execd [1514] (systemd_unit_exec_with_unit) debug: Calling StartUnit for apache: /org/freedesktop/systemd1/unit/httpd_2eservice
>> Feb 19 17:04:10 boss1 pacemaker-execd [1514] (action_complete) notice: Giving up on apache start (rc=0): timeout (elapsed=254930ms, remaining=-154930ms)
>> Feb 19 17:04:10 boss1 pacemaker-execd [1514] (log_finished) debug: finished - rsc:apache action:monitor call_id:25 exit-code:198 exec-time:254935ms queue-time:235ms
>>
>>
>> Starting manually works fine and fast:
>>
>> # time systemctl start httpd
>> real 0m0.144s
>> user 0m0.005s
>> sys 0m0.008s
>>
>>
>> On 17/02/2020 22:47, Mvrk wrote:
>>> The pacemaker.log is attached. In the log I can see that the cluster
>>> tries to start, the start fails, then it tries to stop, and the stop
>>> also fails.
>>>
>>> One more thing: my cluster was working fine on Fedora 28; I started
>>> having this problem after the upgrade to Fedora 31.
>>>
>>> On 17/02/2020 21:30, Ricardo Esteves wrote:
>>>> Hi,
>>>>
>>>> Yes, I also don't understand why it is trying to stop them first.
>>>>
>>>> SELinux is disabled:
>>>>
>>>> # getenforce
>>>> Disabled
>>>>
>>>> All systemd services controlled by the cluster are disabled from
>>>> starting at boot:
>>>>
>>>> # systemctl is-enabled httpd
>>>> disabled
>>>>
>>>> # systemctl is-enabled openvpn-server@01-server
>>>> disabled
>>>>
>>>>
>>>> On 17/02/2020 20:28, Ken Gaillot wrote:
>>>>> On Mon, 2020-02-17 at 17:35 +0000, Maverick wrote:
>>>>>> Hi,
>>>>>>
>>>>>> When I start my cluster, most of my systemd resources won't start:
>>>>>>
>>>>>> Failed Resource Actions:
>>>>>> * apache_stop_0 on boss1 'OCF_TIMEOUT' (198): call=82,
>>>>>> status='Timed Out', exitreason='', last-rc-change='1970-01-01
>>>>>> 01:00:54 +01:00', queued=29ms, exec=197799ms
>>>>>> * openvpn_stop_0 on boss1 'OCF_TIMEOUT' (198): call=61,
>>>>>> status='Timed Out', exitreason='', last-rc-change='1970-01-01
>>>>>> 01:00:54 +01:00', queued=1805ms, exec=198841ms
>>>>> These show that attempts to stop failed, rather than start.
>>>>>
>>>>>> So every time I reboot my node, I need to start the resources manually
>>>>>> using systemd, for example:
>>>>>>
>>>>>> systemctl start httpd
>>>>>>
>>>>>> and then pcs resource cleanup
>>>>>>
>>>>>> Resources configuration:
>>>>>>
>>>>>> Clone: apache-clone
>>>>>>   Meta Attrs: maintenance=false
>>>>>>   Resource: apache (class=systemd type=httpd)
>>>>>>    Meta Attrs: maintenance=false
>>>>>>    Operations: monitor interval=60 timeout=100 (apache-monitor-interval-60)
>>>>>>                start interval=0s timeout=100 (apache-start-interval-0s)
>>>>>>                stop interval=0s timeout=100 (apache-stop-interval-0s)
>>>>>>
>>>>>>
>>>>>> Resource: openvpn (class=systemd type=openvpn-server@01-server)
>>>>>>  Meta Attrs: maintenance=false
>>>>>>  Operations: monitor interval=60 timeout=100 (openvpn-monitor-interval-60)
>>>>>>              start interval=0s timeout=100 (openvpn-start-interval-0s)
>>>>>>              stop interval=0s timeout=100 (openvpn-stop-interval-0s)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Btw, if I try a
>>>>>> debug-start / debug-stop the mentioned resources
>>>>>> start and stop ok.
>>>>> Based on that, my first guess would be SELinux. Check the SELinux logs
>>>>> for denials.
>>>>>
>>>>> Also, make sure your systemd services are not enabled in systemd itself
>>>>> (e.g. via systemctl enable). Clustered systemd services should be
>>>>> managed by the cluster only.
>> _______________________________________________
>> Manage your subscription:
>> https://lists.clusterlabs.org/mailman/listinfo/users
>>
>> ClusterLabs home: https://www.clusterlabs.org/
> You really need to debug the start & stop of the resource.
>
> Please try the debug procedure and provide the output:
> https://wiki.clusterlabs.org/wiki/Debugging_Resource_Failures
>
> Best Regards,
> Strahil Nikolov
_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
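
For anyone who wants to sanity-check the numbers in the pacemaker-execd log quoted in this thread, here is a rough Python sketch. It only redoes the arithmetic on two log lines copied from the thread; it does not talk to Pacemaker. Assumptions on my side: the year is 2020 (taken from the mail dates, the log stamps carry no year), and timeout=100 in the resource config means 100 seconds. The helper names are made up for this example.

```python
import re
from datetime import datetime

# Two log lines copied verbatim from the thread (pacemaker-execd on boss1).
LOG_START = ("Feb 19 17:04:09 boss1 pacemaker-execd [1514] (log_execute) "
             "info: executing - rsc:apache action:start call_id:25")
LOG_GIVE_UP = ("Feb 19 17:04:10 boss1 pacemaker-execd [1514] (action_complete) "
               "notice: Giving up on apache start (rc=0): timeout "
               "(elapsed=254930ms, remaining=-154930ms)")


def log_time(line, year=2020):
    """Parse the leading 'Mon DD HH:MM:SS' stamp.

    The year is an assumption (the log has none); 2020 matches the mail dates.
    """
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def elapsed_and_remaining(line):
    """Extract the elapsed/remaining millisecond fields from a 'Giving up' line."""
    m = re.search(r"elapsed=(-?\d+)ms, remaining=(-?\d+)ms", line)
    if m is None:
        raise ValueError("no elapsed/remaining fields in line")
    return int(m.group(1)), int(m.group(2))


elapsed_ms, remaining_ms = elapsed_and_remaining(LOG_GIVE_UP)

# elapsed + remaining gives the timeout the executor was working with.
implied_timeout_ms = elapsed_ms + remaining_ms

# Wall-clock time between the two log stamps (1-second resolution).
wall_clock_s = (log_time(LOG_GIVE_UP) - log_time(LOG_START)).total_seconds()

print(f"implied timeout : {implied_timeout_ms} ms")
print(f"reported elapsed: {elapsed_ms} ms")
print(f"log timestamps  : {wall_clock_s:.0f} s apart")
```

The implied timeout comes out at 100000 ms, which is consistent with timeout=100 if that value is in seconds. The log stamps are only about one second apart while the reported elapsed time is 254930 ms, so the elapsed figure does not seem to reflect how long the start really ran. That may be worth mentioning when reporting this.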