Manually it starts ok, no problems:

# pcs resource debug-start apache --full
(unpack_config) warning: Blind faith: not fencing unseen nodes
Operation start for apache (systemd::httpd) returned: 'ok' (0)

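A side note on the pacemaker-execd log quoted below: the 'Giving up' line reports elapsed=254930ms and remaining=-154930ms, although the timestamps of the start request (17:04:09) and of the give-up (17:04:10) are about one second apart. Adding the two figures gives 100000ms, which matches the timeout=100 configured for the start operation. A minimal sketch of that arithmetic on a log line follows; the function name implied_timeout_ms is made up for illustration, and the sed patterns assume the elapsed=...ms / remaining=...ms format seen in that one line.

```shell
# implied_timeout_ms LINE
# Prints elapsed + remaining (in ms) taken from a "Giving up" log line.
# Hypothetical helper; only the elapsed=/remaining= format quoted in this
# thread is handled.
implied_timeout_ms() {
    line="$1"
    # Pull out the (possibly negative) integer before "ms" for each field.
    elapsed=$(printf '%s\n' "$line" |
        sed -n 's/.*elapsed=\(-\{0,1\}[0-9][0-9]*\)ms.*/\1/p')
    remaining=$(printf '%s\n' "$line" |
        sed -n 's/.*remaining=\(-\{0,1\}[0-9][0-9]*\)ms.*/\1/p')
    # Fail if either field is missing from the line.
    [ -n "$elapsed" ] && [ -n "$remaining" ] || return 1
    echo $((elapsed + remaining))
}
```

Fed the 'Giving up on apache start' line from the log, this prints 100000, so the timeout budget itself looks right and it is the elapsed figure that disagrees with the wall-clock timestamps.
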
On 20/02/2020 16:46, Strahil Nikolov wrote:
> On February 20, 2020 12:49:43 PM GMT+02:00, Maverick <m...@sapo.pt> wrote:
>>> You really need to debug the start & stop of the resource.
>>>
>>> Please try the debug procedure and provide the output:
>>> https://wiki.clusterlabs.org/wiki/Debugging_Resource_Failures
>>>
>>> Best Regards,
>>> Strahil Nikolov
>>
>> Hi,
>>
>> Correct me if I'm wrong, but I think that procedure doesn't work for
>> systemd class resources; I don't know which OCF script is responsible
>> for handling systemd class resources.
>>
>> Also, the crm command doesn't exist in RHEL/Fedora; I've seen the crm
>> command only in SUSE.
>>
>> On 19/02/2020 19:23, Strahil Nikolov wrote:
>>> On February 19, 2020 7:21:12 PM GMT+02:00, Maverick <m...@sapo.pt> wrote:
>>>> How is it possible that pacemaker is reporting that it takes 4.2 minutes
>>>> (254930ms) to execute the start of the httpd systemd unit?
>>>>
>>>> Feb 19 17:04:09 boss1 pacemaker-execd [1514] (log_execute) info: executing - rsc:apache action:start call_id:25
>>>> Feb 19 17:04:09 boss1 pacemaker-execd [1514] (systemd_unit_exec) debug: Performing asynchronous start op on systemd unit httpd named 'apache'
>>>> Feb 19 17:04:09 boss1 pacemaker-execd [1514] (systemd_unit_exec_with_unit) debug: Calling StartUnit for apache: /org/freedesktop/systemd1/unit/httpd_2eservice
>>>> Feb 19 17:04:10 boss1 pacemaker-execd [1514] (action_complete) notice: Giving up on apache start (rc=0): timeout (elapsed=254930ms, remaining=-154930ms)
>>>> Feb 19 17:04:10 boss1 pacemaker-execd [1514] (log_finished) debug: finished - rsc:apache action:monitor call_id:25 exit-code:198 exec-time:254935ms queue-time:235ms
>>>>
>>>> Starting manually works fine and fast:
>>>>
>>>> # time systemctl start httpd
>>>> real 0m0.144s
>>>> user 0m0.005s
>>>> sys 0m0.008s
>>>>
>>>> On 17/02/2020 22:47, Mvrk wrote:
>>>>> Attached is the pacemaker.log.
>>>>> In the log I can see that the cluster tries to start, the start
>>>>> fails, then it tries to stop, and the stop also fails.
>>>>>
>>>>> One more thing: my cluster was working fine on Fedora 28; I started
>>>>> having this problem after upgrading to Fedora 31.
>>>>>
>>>>> On 17/02/2020 21:30, Ricardo Esteves wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Yes, I also don't understand why it is trying to stop them first.
>>>>>>
>>>>>> SELinux is disabled:
>>>>>>
>>>>>> # getenforce
>>>>>> Disabled
>>>>>>
>>>>>> All systemd services controlled by the cluster are disabled from
>>>>>> starting at boot:
>>>>>>
>>>>>> # systemctl is-enabled httpd
>>>>>> disabled
>>>>>>
>>>>>> # systemctl is-enabled openvpn-server@01-server
>>>>>> disabled
>>>>>>
>>>>>> On 17/02/2020 20:28, Ken Gaillot wrote:
>>>>>>> On Mon, 2020-02-17 at 17:35 +0000, Maverick wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> When I start my cluster, most of my systemd resources won't start:
>>>>>>>>
>>>>>>>> Failed Resource Actions:
>>>>>>>> * apache_stop_0 on boss1 'OCF_TIMEOUT' (198): call=82, status='Timed Out', exitreason='', last-rc-change='1970-01-01 01:00:54 +01:00', queued=29ms, exec=197799ms
>>>>>>>> * openvpn_stop_0 on boss1 'OCF_TIMEOUT' (198): call=61, status='Timed Out', exitreason='', last-rc-change='1970-01-01 01:00:54 +01:00', queued=1805ms, exec=198841ms
>>>>>>> These show that attempts to stop failed, rather than start.
>>>>>>>
>>>>>>>> So every time I reboot my node, I need to start the resources
>>>>>>>> manually using systemd, for example:
>>>>>>>>
>>>>>>>> systemctl start httpd
>>>>>>>>
>>>>>>>> and then pcs resource cleanup
>>>>>>>>
>>>>>>>> Resources configuration:
>>>>>>>>
>>>>>>>> Clone: apache-clone
>>>>>>>>  Meta Attrs: maintenance=false
>>>>>>>>  Resource: apache (class=systemd type=httpd)
>>>>>>>>   Meta Attrs: maintenance=false
>>>>>>>>   Operations: monitor interval=60 timeout=100 (apache-monitor-interval-60)
>>>>>>>>               start interval=0s timeout=100 (apache-start-interval-0s)
>>>>>>>>               stop interval=0s timeout=100 (apache-stop-interval-0s)
>>>>>>>>
>>>>>>>> Resource: openvpn (class=systemd type=openvpn-server@01-server)
>>>>>>>>  Meta Attrs: maintenance=false
>>>>>>>>  Operations: monitor interval=60 timeout=100 (openvpn-monitor-interval-60)
>>>>>>>>              start interval=0s timeout=100 (openvpn-start-interval-0s)
>>>>>>>>              stop interval=0s timeout=100 (openvpn-stop-interval-0s)
>>>>>>>>
>>>>>>>> Btw, if I try a debug-start / debug-stop, the mentioned resources
>>>>>>>> start and stop ok.
>>>>>>> Based on that, my first guess would be SELinux. Check the SELinux
>>>>>>> logs for denials.
>>>>>>>
>>>>>>> Also, make sure your systemd services are not enabled in systemd
>>>>>>> itself (e.g. via systemctl enable). Clustered systemd services
>>>>>>> should be managed by the cluster only.
>>>> _______________________________________________
>>>> Manage your subscription:
>>>> https://lists.clusterlabs.org/mailman/listinfo/users
>>>>
>>>> ClusterLabs home: https://www.clusterlabs.org/
>>> You really need to debug the start & stop of the resource.
>>>
>>> Please try the debug procedure and provide the output:
>>> https://wiki.clusterlabs.org/wiki/Debugging_Resource_Failures
>>>
>>> Best Regards,
>>> Strahil Nikolov
> Hi Maverick,
>
> you can replace 'crm resource stop' with 'pcs resource disable'.
> The rest is working, but sadly not for systemd.
>
> You can try:
> 'pcs resource debug-start <resource> --full'
> Another approach is to:
> 1. Copy the service unit file to /etc/systemd/system
> 2. In the '[Service]' section add this:
> Environment=SYSTEMD_LOG_LEVEL=debug
> 3. Reload systemd:
> systemctl daemon-reload
> Note: I assume you have downtime available for debugging the issue
> 4. Use 'debug-start --full'
>
> Note: Don't forget to remove the debug setting afterwards, or your journal will fill up.
>
> Best Regards,
> Strahil Nikolov
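For anyone following the thread, steps 1 and 2 above can be sketched as a small shell helper. This is only a sketch under assumptions: instead of copying the whole unit file it writes a systemd drop-in (<unit>.d/debug.conf under /etc/systemd/system), which is a variation on step 1 and not what was suggested. The function name write_debug_dropin, the file name debug.conf and the optional base-directory argument are made up for illustration.

```shell
# write_debug_dropin UNIT [BASE_DIR]
# Writes a drop-in that sets SYSTEMD_LOG_LEVEL=debug for UNIT and prints
# the path of the file it created. BASE_DIR defaults to /etc/systemd/system
# (hypothetical helper; writing to the default location needs root).
write_debug_dropin() {
    unit="$1"
    base="${2:-/etc/systemd/system}"
    dir="$base/$unit.d"
    mkdir -p "$dir" || return 1
    # Drop-in content: the same setting as step 2, in a [Service] section.
    printf '[Service]\nEnvironment=SYSTEMD_LOG_LEVEL=debug\n' \
        > "$dir/debug.conf" || return 1
    printf '%s\n' "$dir/debug.conf"
}
```

Typical use would be to run 'write_debug_dropin httpd.service' as root, then 'systemctl daemon-reload', then 'pcs resource debug-start apache --full'. When done, delete the debug.conf file and run 'systemctl daemon-reload' again so the extra logging does not fill the journal.
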