On Wed, 2020-02-19 at 18:21 +0100, Maverick wrote:
> How is it possible that pacemaker is reporting that takes 4.2 minutes
> (254930ms) to execute the start of httpd systemd unit?

Sorry I didn't get a chance to look into this sooner.

Fedora 31 introduced a change where the ftime() call that pacemaker had
been using for operation timing was no longer available. We implemented
clock_gettime()-based timing in a rush because it happened right before
the release of 2.0.3. We enabled that code only for systems like Fedora
31 that didn't support ftime().

The clock_gettime()-based code turned out to have a bug that was
recently fixed. The fixes will be in 2.0.4 (the first release candidate
should come out in a couple of weeks) which will then be packaged for
Fedora 31 and 32.

> Feb 19 17:04:09 boss1 pacemaker-execd [1514] (log_execute) info: executing - rsc:apache action:start call_id:25
> Feb 19 17:04:09 boss1 pacemaker-execd [1514] (systemd_unit_exec) debug: Performing asynchronous start op on systemd unit httpd named 'apache'
> Feb 19 17:04:09 boss1 pacemaker-execd [1514] (systemd_unit_exec_with_unit) debug: Calling StartUnit for apache: /org/freedesktop/systemd1/unit/httpd_2eservice
> Feb 19 17:04:10 boss1 pacemaker-execd [1514] (action_complete) notice: Giving up on apache start (rc=0): timeout (elapsed=254930ms, remaining=-154930ms)
> Feb 19 17:04:10 boss1 pacemaker-execd [1514] (log_finished) debug: finished - rsc:apache action:monitor call_id:25 exit-code:198 exec-time:254935ms queue-time:235ms
>
> Starting manually works fine and fast:
>
> # time systemctl start httpd
> real 0m0.144s
> user 0m0.005s
> sys 0m0.008s
>
> On 17/02/2020 22:47, Mvrk wrote:
> > In attachment the pacemaker.log. On the log i can see that the
> > cluster tries to start, the start fails, then tries to stop, and
> > the stop also fails also.
> >
> > One more thing, my cluster was working fine on Fedora 28, i started
> > having this problem after upgrade to Fedora 31.
> >
> > On 17/02/2020 21:30, Ricardo Esteves wrote:
> > > Hi,
> > >
> > > Yes, i also don't understand why is trying to stop them first.
> > >
> > > SELinux is disabled:
> > >
> > > # getenforce
> > > Disabled
> > >
> > > All systemd services controlled by the cluster are disabled from
> > > starting at boot:
> > >
> > > # systemctl is-enabled httpd
> > > disabled
> > >
> > > # systemctl is-enabled openvpn-server@01-server
> > > disabled
> > >
> > > On 17/02/2020 20:28, Ken Gaillot wrote:
> > > > On Mon, 2020-02-17 at 17:35 +0000, Maverick wrote:
> > > > > Hi,
> > > > >
> > > > > When i start my cluster, most of my systemd resources won't
> > > > > start:
> > > > >
> > > > > Failed Resource Actions:
> > > > >   * apache_stop_0 on boss1 'OCF_TIMEOUT' (198): call=82, status='Timed Out', exitreason='', last-rc-change='1970-01-01 01:00:54 +01:00', queued=29ms, exec=197799ms
> > > > >   * openvpn_stop_0 on boss1 'OCF_TIMEOUT' (198): call=61, status='Timed Out', exitreason='', last-rc-change='1970-01-01 01:00:54 +01:00', queued=1805ms, exec=198841ms
> > > >
> > > > These show that attempts to stop failed, rather than start.
> > > >
> > > > > So everytime i reboot my node, i need to start the resources
> > > > > manually using systemd, for example:
> > > > >
> > > > > systemd start apache
> > > > >
> > > > > and then pcs resource cleanup
> > > > >
> > > > > Resources configuration:
> > > > >
> > > > >  Clone: apache-clone
> > > > >   Meta Attrs: maintenance=false
> > > > >   Resource: apache (class=systemd type=httpd)
> > > > >    Meta Attrs: maintenance=false
> > > > >    Operations: monitor interval=60 timeout=100 (apache-monitor-interval-60)
> > > > >                start interval=0s timeout=100 (apache-start-interval-0s)
> > > > >                stop interval=0s timeout=100 (apache-stop-interval-0s)
> > > > >
> > > > >  Resource: openvpn (class=systemd type=openvpn-server@01-server)
> > > > >   Meta Attrs: maintenance=false
> > > > >   Operations: monitor interval=60 timeout=100 (openvpn-monitor-interval-60)
> > > > >               start interval=0s timeout=100 (openvpn-start-interval-0s)
> > > > >               stop interval=0s timeout=100 (openvpn-stop-interval-0s)
> > > > >
> > > > > Btw, if i try a debug-start / debug-stop the mentioned
> > > > > resources start and stop ok.
> > > >
> > > > Based on that, my first guess would be SELinux. Check the
> > > > SELinux logs for denials.
> > > >
> > > > Also, make sure your systemd services are not enabled in
> > > > systemd itself (e.g. via systemctl enable). Clustered systemd
> > > > services should be managed by the cluster only.
--
Ken Gaillot <kgail...@redhat.com>

_______________________________________________
Manage your subscription:
https://lists.clusterlabs.org/mailman/listinfo/users

ClusterLabs home: https://www.clusterlabs.org/
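
As a footnote to the timing discussion above, here is a rough sketch of what clock_gettime()-based operation timing can look like. This is not Pacemaker's actual code, and it does not try to reproduce the bug mentioned in the reply (the thread gives no details of it). The helper names monotonic_now(), elapsed_ms() and remaining_ms() are made up for illustration. It only shows how "elapsed" and "remaining" values like the ones in the quoted log relate to a configured timeout.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose clock_gettime() under strict -std= modes */
#endif
#include <time.h>

/* Hypothetical helper: read a monotonic timestamp; returns 0 on success. */
static int monotonic_now(struct timespec *ts)
{
    return clock_gettime(CLOCK_MONOTONIC, ts);
}

/* Hypothetical helper: whole milliseconds elapsed from *start to *end.
 * Seconds and nanoseconds are combined into one nanosecond count first,
 * so a negative tv_nsec difference needs no special handling. */
static long long elapsed_ms(const struct timespec *start,
                            const struct timespec *end)
{
    long long ns = ((long long) end->tv_sec - (long long) start->tv_sec)
                   * 1000000000LL
                   + ((long long) end->tv_nsec - (long long) start->tv_nsec);

    return ns / 1000000LL;
}

/* Hypothetical helper: time left before the timeout; negative once the
 * operation has overrun it. */
static long long remaining_ms(long long timeout_ms, long long elapsed)
{
    return timeout_ms - elapsed;
}
```

With the configured timeout=100 (100000ms) and the reported elapsed=254930ms, remaining_ms(100000, 254930) gives -154930, which matches the remaining=-154930ms in the quoted log line.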