Hi Andrew,

Yes, I tried with both "/etc/init.d/apache2 start" and "service apache2 start".
I suppose I have to test with the path defined in crm configure show, which might be something like /usr/bin/apache2 or something like that (I am not at home just now so I can't check). I will have a look.

Thanks very much for your help.

G

On 19 September 2011 00:41, Andrew Beekhof <and...@beekhof.net> wrote:
> On Fri, Sep 16, 2011 at 11:22 PM, Guillaume Bettayeb
> <guillaume1...@gmail.com> wrote:
> > Hi all,
> >
> > I have been through my Apache configuration again and I confirm Apache
> > works fine.
>
> I assume you're testing by running "/etc/init.d/apache2 start" or
> something similar?
> This is not what the cluster executes to start apache, and therefore the
> test doesn't help much.
>
> > I have changed the corosync config file to dump all the corosync log into
> > /var/log/corosync/corosync.log
> >
> > Then I restarted corosync, and the log file has the following:
> >
> > http://pastebin.com/BFVVfxCh
> >
> > Could the following lines be the consequence of the error?
>
> A consequence yes, but not the cause.
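[Editor's note: what the cluster runs instead of the init script is the ocf:heartbeat:apache resource agent. One way to reproduce the cluster's view of the resource is to invoke the agent by hand with the same parameters as the primitive. This is a sketch only: the paths below are the usual resource-agents locations on Debian/Ubuntu and may differ on your install, and the OCF_RESKEY_* values are taken from the primitive shown later in this thread.]

```shell
# Invoke the same OCF agent Pacemaker uses, passing the primitive's
# parameters via the OCF_RESKEY_* environment variables.
export OCF_ROOT=/usr/lib/ocf
export OCF_RESKEY_configfile=/etc/apache2/apache2.conf
export OCF_RESKEY_httpd=/usr/sbin/apache2

/usr/lib/ocf/resource.d/heartbeat/apache start;   echo "start rc=$?"
/usr/lib/ocf/resource.d/heartbeat/apache monitor; echo "monitor rc=$?"
/usr/lib/ocf/resource.d/heartbeat/apache stop;    echo "stop rc=$?"
```

An rc of 0 means success; any non-zero OCF return code here should correspond to the "unknown error" that crm_mon reports. If cluster-glue is installed, the ocf-tester utility wraps the same calls.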
> > root@node1:/var/log/corosync# cat corosync-apache.log | grep "INFINITY times"
> > Sep 16 14:11:13 node1 pengine: [7103]: info: get_failcount: apache has failed INFINITY times on node1
> > Sep 16 14:11:15 node1 pengine: [7103]: info: get_failcount: apache has failed INFINITY times on node1
> > Sep 16 14:11:16 node1 pengine: [7103]: info: get_failcount: apache has failed INFINITY times on node1
> > Sep 16 14:11:18 node1 pengine: [7103]: info: get_failcount: apache has failed INFINITY times on node1
> > Sep 16 14:11:18 node1 pengine: [7103]: info: get_failcount: apache has failed INFINITY times on node2
> > Sep 16 14:11:18 node1 pengine: [7103]: info: get_failcount: apache has failed INFINITY times on node1
> > Sep 16 14:11:18 node1 pengine: [7103]: info: get_failcount: apache has failed INFINITY times on node2
> >
> > Thanks again,
> >
> > G
> >
> > On 16 September 2011 11:00, Guillaume Bettayeb <guillaume1...@gmail.com> wrote:
> >> Hi Dejan,
> >>
> >> I am not sure, because Apache runs like a charm when not started via
> >> Corosync, but I don't know.
> >>
> >> Thanks,
> >>
> >> Guillaume
> >>
> >> On 16 September 2011 09:01, Dejan Muhamedagic <deja...@fastmail.fm> wrote:
> >>> Hi,
> >>>
> >>> On Fri, Sep 16, 2011 at 03:01:12AM +0100, Guillaume Bettayeb wrote:
> >>> > Hi all,
> >>> >
> >>> > I am still struggling to run apache in corosync. My Apache service is OK
> >>> > and runs fine if I start it manually, and I have mod_status enabled on
> >>> > both nodes. Ulrich made a good point earlier by asking if the cgisock
> >>> > shown by "ls -l /var/run/apache2" was used by another process, but
> >>> > that's not the case.
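[Editor's note: the "failed INFINITY times" lines above have a practical consequence: once a resource's failcount reaches INFINITY on a node, the policy engine will not try to start it there again, even after the underlying problem is fixed, until the failcount is cleared. With the crm shell used in this thread, something like:]

```shell
# Clear the stored failcount and operation history for the resource,
# so the policy engine will attempt to start it again.
# Run this only AFTER fixing the underlying start failure.
crm resource cleanup apache
```

Per-node fail counts can also be inspected with `crm_mon -f`.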
> >>> > > >>> > I've restarted corosync after midnight and have added the full syslog > >>> here : > >>> > > >>> > http://pastebin.com/LXmLUu3W > >>> > > >>> > > >>> > I keep digging on google to find out why is it not working but any > help > >>> > would be greatly appreciated..is anyone else runs an Ubuntu cluster > with > >>> > Apache here by any chance ? > >>> > >>> Perhaps, but since this seems to be an issue with apache on > >>> Ubuntu I guess it's best to enquire in some ubuntu forum. > >>> > >>> Thanks, > >>> > >>> Dejan > >>> > >>> > Thanks, > >>> > > >>> > Guillaume > >>> > > >>> > > >>> > On 15 September 2011 16:04, Guillaume Bettayeb < > guillaume1...@gmail.com > >>> >wrote: > >>> > > >>> > > Hi Ulrich, > >>> > > > >>> > > nope, there's nothing in there at the moment : > >>> > > root@node1:/home/user# ls -l /var/run/apache2 > >>> > > ls: cannot access /var/run/apache2: No such file or directory > >>> > > > >>> > > > >>> > > it looks like that error comes up when the cluster starts apache. > >>> > > > >>> > > > >>> > > Guillaume > >>> > > > >>> > > On 15 September 2011 16:00, Ulrich Windl < > >>> > > ulrich.wi...@rz.uni-regensburg.de> wrote: > >>> > > > >>> > >> Hi! > >>> > >> > >>> > >> What about "ls -l /var/run/apache2"? Any cgisock* there? > Permissions > >>> of > >>> > >> the directory OK? Who is using that cgisock? 
> >>> > >>
> >>> > >> Ulrich
> >>> > >>
> >>> > >> >>> Guillaume Bettayeb <guillaume1...@gmail.com> wrote on 15.09.2011 at 16:06 in message
> >>> > >> <CAG6QY=LP6S1t=+jsq7qv+rmp5qqq02crnbea38nzbgvbd...@mail.gmail.com>:
> >>> > >> > Hi all,
> >>> > >> >
> >>> > >> > Thanks for your advice. I have double-checked mod_status in Apache and
> >>> > >> > it's definitely enabled on both nodes:
> >>> > >> > ls /etc/apache2/mods-enabled
> >>> > >> > alias.conf            authz_user.load  dir.conf          reqtimeout.conf
> >>> > >> > alias.load            autoindex.conf   dir.load          reqtimeout.load
> >>> > >> > auth_basic.load       autoindex.load   env.load          setenvif.conf
> >>> > >> > authn_file.load       cgid.conf        mime.conf         setenvif.load
> >>> > >> > authz_default.load    cgid.load        mime.load         status.conf
> >>> > >> > authz_groupfile.load  deflate.conf     negotiation.conf  status.load
> >>> > >> > authz_host.load       deflate.load     negotiation.load
> >>> > >> >
> >>> > >> > I have checked the status page http://node/server-status and I can see
> >>> > >> > it OK. mod_status is enabled on my node and runs fine.
> >>> > >> >
> >>> > >> > I had a look at my Apache log as you advised, but I can't see Apache
> >>> > >> > moaning about a specific error, apart from multiple stops and restarts
> >>> > >> > due to my tests:
> >>> > >> >
> >>> > >> > [Thu Sep 15 14:20:37 2011] [notice] caught SIGTERM, shutting down
> >>> > >> > [Thu Sep 15 14:20:38 2011] [notice] Apache/2.2.17 (Ubuntu) configured -- resuming normal operations
> >>> > >> > [Thu Sep 15 14:20:38 2011] [error] (2)No such file or directory: Couldn't bind unix domain socket /var/run/apache2/cgisock.4278
> >>> > >> > [Thu Sep 15 14:20:39 2011] [notice] caught SIGTERM, shutting down
> >>> > >> >
> >>> > >> > That's for the primary node. It looks like Corosync shuts down Apache.
> >>> > >> > In the Apache log file of the second node, I see the following:
> >>> > >> >
> >>> > >> > [Thu Sep 15 14:18:27 2011] [notice] Apache/2.2.17 (Ubuntu) configured -- resuming normal operations
> >>> > >> > [Thu Sep 15 14:18:27 2011] [error] (2)No such file or directory: Couldn't bind unix domain socket /var/run/apache2/cgisock.1338
> >>> > >> > [Thu Sep 15 14:18:28 2011] [crit] cgid daemon failed to initialize
> >>> > >> >
> >>> > >> > I still have errors, but the http service keeps running, no SIGTERM.
> >>> > >> >
> >>> > >> > And then my node status is:
> >>> > >> >
> >>> > >> > Online: [ node1 node2 ]
> >>> > >> >
> >>> > >> > Resource Group: group1
> >>> > >> >     failover-ip (ocf::heartbeat:IPaddr): Started node1
> >>> > >> >     apache      (ocf::heartbeat:apache): Stopped
> >>> > >> >
> >>> > >> > Failed actions:
> >>> > >> >     apache_start_0 (node=node2, call=6, rc=1, status=complete): unknown error
> >>> > >> >     apache_monitor_0 (node=node1, call=3, rc=1, status=complete): unknown error
> >>> > >> >     apache_start_0 (node=node1, call=7, rc=1, status=complete): unknown error
> >>> > >> >
> >>> > >> > Of interest, some information found in /var/log/syslog on node1:
> >>> > >> >
> >>> > >> > Sep 15 02:32:56 node1 crmd: [710]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
> >>> > >> > Sep 15 02:32:56 node1 apache[928]: INFO: apache not running
> >>> > >> > Sep 15 02:32:56 node1 apache[928]: INFO: waiting for apache /etc/apache2/apache2.conf to come up
> >>> > >> >
> >>> > >> > Sep 15 02:32:58 node1 apache[928]: INFO: Killing apache PID 995
> >>> > >> > Sep 15 02:32:59 node1 lrmd: [707]: info: RA output: (apache:start:stderr) kill: 833:
> >>> > >> > Sep 15 02:32:59 node1 lrmd: [707]: info: RA output: (apache:start:stderr) No such process
> >>> > >> > Sep 15 02:32:59 node1 lrmd: [707]: info: RA output: (apache:start:stderr)
> >>> > >> > Sep 15 02:32:59 node1 apache[928]: INFO: Killing apache PID 995
> >>> > >> > Sep 15 02:32:59 node1 apache[928]: INFO: apache stopped.
> >>> > >> > Sep 15 02:32:59 node1 crmd: [710]: info: process_lrm_event: LRM operation apache_start_0 (call=6, rc=1, cib-update=37, confirmed=true) unknown error
> >>> > >> > Sep 15 02:32:59 node1 crmd: [710]: WARN: status_from_rc: Action 8 (apache_start_0) on node1 failed (target: 0 vs. rc: 1): Error
> >>> > >> > Sep 15 02:32:59 node1 crmd: [710]: WARN: update_failcount: Updating failcount for apache on node1 after failed start: rc=1 (update=INFINITY, time=1316050379)
> >>> > >> > Sep 15 02:32:59 node1 crmd: [710]: info: abort_transition_graph: match_graph_event:272 - Triggered transition abort (complete=0, tag=lrm_rsc_op, id=apache_start_0, magic=0:1;8:3:0:a4e41810-3e8f-439a-9b92-489edf657291, cib=0.172.10) : Event failed
> >>> > >> > Sep 15 02:32:59 node1 crmd: [710]: info: update_abort_priority: Abort priority upgraded from 0 to 1
> >>> > >> > Sep 15 02:32:59 node1 crmd: [710]: info: update_abort_priority: Abort action done superceeded by restart
> >>> > >> > Sep 15 02:32:59 node1 crmd: [710]: info: match_graph_event: Action apache_start_0 (8) confirmed on node1 (rc=4)
> >>> > >> > Sep 15 02:32:59 node1 crmd: [710]: info: run_graph: ====================================================
> >>> > >> > Sep 15 02:32:59 node1 crmd: [710]: notice: run_graph: Transition 3 (Complete=3, Pending=0, Fired=0, Skipped=4, Incomplete=0, Source=/var/lib/pengine/pe-input-247.bz2): Stopped
> >>> > >> > Sep 15 02:32:59 node1 crmd: [710]: info: te_graph_trigger: Transition 3 is now complete
> >>> > >> > Sep 15 02:32:59 node1 crmd: [710]: info: do_state_transition: State transition S_TRANSITION_ENGINE -> S_POLICY_ENGINE [ input=I_PE_CALC cause=C_FSA_INTERNAL origin=notify_crmd ]
> >>> > >> > Sep 15 02:32:59 node1 crmd: [710]: info: do_state_transition: All 2 cluster nodes are eligible to run resources.
> >>> > >> >
> >>> > >> > Here is my crm configure show; is there anything I can change there?
> >>> > >> >
> >>> > >> > root@node1:/home/user# crm configure show
> >>> > >> > node node1 \
> >>> > >> >     attributes standby="off"
> >>> > >> > node node2 \
> >>> > >> >     attributes standby="off"
> >>> > >> > primitive apache ocf:heartbeat:apache \
> >>> > >> >     params configfile="/etc/apache2/apache2.conf" httpd="/usr/sbin/apache2" \
> >>> > >> >     op start interval="10" timeout="40s" \
> >>> > >> >     op stop interval="10" timeout="60s" \
> >>> > >> >     op monitor interval="5s"
> >>> > >> > primitive failover-ip ocf:heartbeat:IPaddr \
> >>> > >> >     params ip="192.168.0.105" \
> >>> > >> >     op monitor interval="5s"
> >>> > >> > group group1 failover-ip apache
> >>> > >> > location cli-prefer-failover-ip failover-ip \
> >>> > >> >     rule $id="cli-prefer-rule-failover-ip" inf: #uname eq node1
> >>> > >> > property $id="cib-bootstrap-options" \
> >>> > >> >     dc-version="1.0.9-da7075976b5ff0bee71074385f8fd02f296ec8a3" \
> >>> > >> >     cluster-infrastructure="openais" \
> >>> > >> >     expected-quorum-votes="2" \
> >>> > >> >     stonith-enabled="false" \
> >>> > >> >     no-quorum-policy="ignore"
> >>> > >> >
> >>> > >> > Thank you for your help,
> >>> > >> >
> >>> > >> > Guillaume
> >>> > >> >
> >>> > >> > On 13 September 2011 08:29, Tim Serong <tser...@suse.com> wrote:
> >>> > >> > > On 13/09/11 00:39, Guillaume Bettayeb wrote:
> >>> > >> > > > Hi there,
> >>> > >> > > >
> >>> > >> > > > This is my first post on this list, so hello everybody :)
> >>> > >> > > >
> >>> > >> > > > I am currently testing the fun of Linux HA clustering (just for
> >>> > >> > > > personal interest), and I have successfully set up a tiny Ubuntu
> >>> > >> > > > VirtualBox two-node cluster with IP failover and Apache running
> >>> > >> > > > as resources.
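[Editor's note: one detail that stands out in the configuration above, although it is unlikely to be the root cause of the start failure: the start and stop operations carry interval="10". Intervals only make sense for recurring operations such as monitor; start and stop are one-shot, and their interval should be 0 (later Pacemaker versions warn about this). A possible cleanup of the primitive, with the monitor timeout of 20s being an assumed value, not something from the thread:]

```
primitive apache ocf:heartbeat:apache \
    params configfile="/etc/apache2/apache2.conf" httpd="/usr/sbin/apache2" \
    op start interval="0" timeout="40s" \
    op stop interval="0" timeout="60s" \
    op monitor interval="5s" timeout="20s"
```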
> >>> > >> > > >
> >>> > >> > > > Right after the install, I tried to move the resources from one
> >>> > >> > > > node to the other (standby command) and everything worked like a
> >>> > >> > > > charm. Then I tried some failure tests, starting with a simple
> >>> > >> > > > /etc/init.d/networking stop on one node, and noticed that the
> >>> > >> > > > other node took ownership of the resources automatically; all was
> >>> > >> > > > fine.
> >>> > >> > > >
> >>> > >> > > > Then I rebooted the nodes just to see how they would restart the
> >>> > >> > > > cluster, and since then I have the following error:
> >>> > >> > > >
> >>> > >> > > > apache_start_0 (node=node1, call=8, rc=1, status=complete): unknown error
> >>> > >> > > >
> >>> > >> > > > For reading convenience, my outputs are available at
> >>> > >> > > > http://pastebin.com/w1J4TWaG
> >>> > >> > > > Just to clarify, that's:
> >>> > >> > > > - the crm configure show command
> >>> > >> > > > - crm_mon status
> >>> > >> > > > - all relevant information from my /var/log/syslog (although I
> >>> > >> > > >   was not sure what to look at; I have never used corosync before)
> >>> > >> > > >
> >>> > >> > > > I have read in an older post that this apache error usually has
> >>> > >> > > > something to do with either the timeout or mod_status.
> >>> > >> > > > As you can see on my pastebin, my timeout values are OK:
> >>> > >> > > > op stop interval="60s" timeout="120" \
> >>> > >> > > > op start interval="60s" timeout="120" \
> >>> > >> > > >
> >>> > >> > > > As for mod_status, it's already enabled in Apache:
> >>> > >> > > > root@node1:/etc/apache2# a2enmod status
> >>> > >> > > > Module status already enabled
> >>> > >> > > >
> >>> > >> > > > Have I done anything wrong, or is there anything else I should
> >>> > >> > > > check/configure?
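[Editor's note: since the ocf:heartbeat:apache agent's monitor action works by fetching Apache's status page from the local node, it is worth checking that URL from a shell on each node rather than only from a browser elsewhere; the statusurl commonly resolves to something like http://localhost/server-status, derived from the Listen directive. A quick check, assuming curl is installed:]

```shell
# The OCF agent's monitor fetches the server-status page from the
# node itself; -f makes curl fail on HTTP errors such as 403/404,
# which is exactly what would make the monitor action fail.
curl -fsS http://localhost/server-status >/dev/null && echo "status OK"
```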
> >>> > >> > > >
> >>> > >> > > > Any help with this matter would be greatly appreciated :)
> >>> > >> > >
> >>> > >> > > On a punt, it's probably mod_status. Check your Apache logs at the
> >>> > >> > > time the start failed. If it's whining about a 403 or 404 for
> >>> > >> > > /server-status (or similar), you need to fix that in your Apache
> >>> > >> > > config.
> >>> > >> > >
> >>> > >> > > HTH,
> >>> > >> > >
> >>> > >> > > Tim
> >>> > >> > > --
> >>> > >> > > Tim Serong
> >>> > >> > > Senior Clustering Engineer
> >>> > >> > > SUSE
> >>> > >> > > tser...@suse.com
> >>> > >> > > _______________________________________________
> >>> > >> > > Linux-HA mailing list
> >>> > >> > > Linux-HA@lists.linux-ha.org
> >>> > >> > > http://lists.linux-ha.org/mailman/listinfo/linux-ha
> >>> > >> > > See also: http://linux-ha.org/ReportingProblems
_______________________________________________
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems