On 06/14/2016 03:10 AM, Jeremy Voisin wrote:
> Hi all,
>
> We actually have a 2 nodes cluster with corosync and pacemaker for
> httpd. We have 2 VIP configured.
>
> Since we’ve added ModSecurity 2.9, httpd restart is very slow. So I
> increased the start / stop timeout. But sometimes, after logrotate the
> following error occurs :
>
> Failed Actions:
> * WebSite_monitor_300000 on node1 'not running' (7): call=26, status=complete, exitreason='none',
>     last-rc-change='Tue Jun 14 03:43:05 2016', queued=0ms, exec=0ms
>
> Here is the full output of crm_mon :
>
> Last updated: Tue Jun 14 07:22:28 2016 Last change: Fri Jun 10 09:28:03 2016 by root via cibadmin on node1
> Stack: corosync
> Current DC: node1 (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum
> 2 nodes and 4 resources configured
>
> Online: [ node1 node2 ]
>
> WebSite (systemd:httpd): Started node1
> Resource Group: WAFCluster
>     VirtualIP (ocf::heartbeat:IPaddr2): Started node1
>     MailMon (ocf::heartbeat:MailTo): Started node1
>     VirtualIP2 (ocf::heartbeat:IPaddr2): Started node1
>
> Failed Actions:
> * WebSite_monitor_300000 on node1 'not running' (7): call=26, status=complete, exitreason='none',
>     last-rc-change='Tue Jun 14 03:43:05 2016', queued=0ms, exec=0ms
>
> # pcs resource --full
>  Resource: WebSite (class=systemd type=httpd)
>   Attributes: configfile=/etc/httpd/conf/httpd.conf statusurl=http://127.0.0.1/server-status monitor=1min
>   Operations: monitor interval=300s (WebSite-monitor-interval-300s)
>               start interval=0s timeout=300s (WebSite-start-interval-0s)
>               stop interval=0s timeout=300s (WebSite-stop-interval-0s)
>  Group: WAFCluster
>   Resource: VirtualIP (class=ocf provider=heartbeat type=IPaddr2)
>    Attributes: ip=195.70.7.74 cidr_netmask=27
>    Operations: start interval=0s timeout=20s (VirtualIP-start-interval-0s)
>                stop interval=0s timeout=20s (VirtualIP-stop-interval-0s)
>                monitor interval=30s (VirtualIP-monitor-interval-30s)
>   Resource: MailMon (class=ocf provider=heartbeat type=MailTo)
>    Attributes: email=sys...@dfi.ch
>    Operations: start interval=0s timeout=10 (MailMon-start-interval-0s)
>                stop interval=0s timeout=10 (MailMon-stop-interval-0s)
>                monitor interval=10 timeout=10 (MailMon-monitor-interval-10)
>   Resource: VirtualIP2 (class=ocf provider=heartbeat type=IPaddr2)
>    Attributes: ip=195.70.7.75 cidr_netmask=27
>    Operations: start interval=0s timeout=20s (VirtualIP2-start-interval-0s)
>                stop interval=0s timeout=20s (VirtualIP2-stop-interval-0s)
>                monitor interval=30s (VirtualIP2-monitor-interval-30s)
>
> If I run /crm_resource -P/ the Failed Actions disappear.
>
> How can I fix the monitor “not running” error ?
>
> Thanks,
> Jérémy

Why does logrotate cause the site to stop responding? Normally it's a graceful restart, which shouldn't cause any interruptions. Any solution will have to be in logrotate, to keep it from interrupting service.

Personally, my preferred configuration is to make Apache log to syslog instead of its usual log file. You can even configure syslog to write those messages to the usual file, so there's no major difference. Then you don't need a separate logrotate script for Apache; its log gets rotated with the system log. That avoids having to restart Apache, which for a busy site can be a big deal. It also gives you the option of tying into syslog tools such as remote logging.
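
As a rough sketch of the syslog approach (this is not taken from the original thread): Apache's ErrorLog directive accepts a syslog:facility target, and access logs can be piped to the logger utility. The local6 facility, the "httpd" tag, the /usr/bin/logger path, and the destination file name are all assumptions chosen for illustration. The "combined" format is assumed to be defined by a LogFormat line, as it is in a stock httpd.conf.

```
# httpd.conf fragment (facility, tag and logger path are illustrative assumptions)
ErrorLog syslog:local6
CustomLog "|/usr/bin/logger -t httpd -p local6.info" combined

# rsyslog rule, e.g. in a drop-in file under /etc/rsyslog.d/ (file name is hypothetical)
local6.*    /var/log/httpd/httpd.log
```

With a setup along these lines, the separate logrotate script for httpd would presumably be removed or adjusted, and the new file added to whatever rotates the system logs. It is worth trying on the standby node first.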