I'd check your Apache logs first; my guess is that Apache doesn't like the config. Failing that, find out where and why the Apache RA (resource agent) is returning 1.
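In case it helps, here is a rough sketch of how I'd run the agent by hand, outside the cluster, to see what it is complaining about. The exit-code names are the standard OCF ones. The /usr/lib/ocf default and the helper names ocf_rc_name and run_ra are my own assumptions (that path is usual on Debian, but check your install); the configfile default is taken from your crm config.

```shell
#!/bin/sh
# Map an OCF resource-agent exit code to its standard name.
ocf_rc_name() {
    case "$1" in
        0) echo OCF_SUCCESS ;;
        1) echo OCF_ERR_GENERIC ;;
        2) echo OCF_ERR_ARGS ;;
        3) echo OCF_ERR_UNIMPLEMENTED ;;
        4) echo OCF_ERR_PERM ;;
        5) echo OCF_ERR_INSTALLED ;;
        6) echo OCF_ERR_CONFIGURED ;;
        7) echo OCF_NOT_RUNNING ;;
        *) echo UNKNOWN ;;
    esac
}

# Run one action of a resource agent by hand and report its exit code.
# Hypothetical helper: $1 is the path to the agent, $2 the action
# (monitor, start, stop, ...). OCF_ROOT defaults to /usr/lib/ocf
# (an assumption); resource parameters are passed as OCF_RESKEY_* variables.
run_ra() {
    agent=$1
    action=$2
    OCF_ROOT="${OCF_ROOT:-/usr/lib/ocf}" \
    OCF_RESKEY_configfile="${OCF_RESKEY_configfile:-/etc/apache2/apache2.conf}" \
        "$agent" "$action"
    rc=$?
    echo "$action: rc=$rc ($(ocf_rc_name "$rc"))"
    return "$rc"
}
```

Something like `run_ra /usr/lib/ocf/resource.d/heartbeat/apache monitor` (agent path assumed) should then show the agent's own error output. With Apache stopped, a clean probe should come back as 7 (OCF_NOT_RUNNING), which is the "target: 7" in your log; 1 is only a generic error, so the real reason should be in the agent's stderr or the Apache error log.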

On Mon, Oct 3, 2011 at 5:58 PM, Miltiadis Koutsokeras <m.koutsoke...@biovista.com> wrote:
> Hi again,
>
> I have gathered all interesting config and log files to a single archive.
> See the attachment. Thanks in advance for any help/advise.
>
> Miltos
>
> On 10/02/2011 06:19 PM, Miltiadis Koutsokeras wrote:
>>
>> Hi Nick,
>>
>> Here is the output of the "crm configure show":
>>
>> node node-0
>> node node-1
>> primitive Apache2 ocf:heartbeat:apache \
>>         params configfile="/etc/apache2/apache2.conf" \
>>         op monitor interval="1min" \
>>         meta target-role="Started"
>> primitive ClusterIP ocf:heartbeat:IPaddr2 \
>>         params ip="192.168.0.100" cidr_netmask="32" \
>>         op monitor interval="30s" \
>>         meta target-role="Started"
>> colocation Apache2-ClusterIP-colocation inf: Apache2 ClusterIP
>> order Apache2-after-ClusterIP inf: ClusterIP Apache2
>> property $id="cib-bootstrap-options" \
>>         dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>>         cluster-infrastructure="openais" \
>>         expected-quorum-votes="2" \
>>         stonith-enabled="false" \
>>         no-quorum-policy="ignore"
>> rsc_defaults $id="rsc-options" \
>>         resource-stickiness="100"
>>
>> If you wish anything else, please feel free to ask.
>>
>> On 10/01/2011 02:50 PM, Nick Khamis wrote:
>>>
>>> Can you post your crm please.
>>>
>>> Nick.
>>>
>>> On Sat, Oct 1, 2011 at 6:32 AM, Miltiadis Koutsokeras
>>> <m.koutsoke...@biovista.com> wrote:
>>>>
>>>> Hello everyone,
>>>>
>>>> My goal is to build a Round Robin balanced, HA Apache Web server
>>>> cluster. The main purpose is to balance HTTP requests evenly between
>>>> the nodes and have one machine pickup all requests if and ONLY if the
>>>> others are not available at the moment. The cluster will be accessible
>>>> only from internal network. Any advise on this will be highly
>>>> appreciated (resources to use, services to install and configure
>>>> etc.). After walking through ClusterLabs documentation, I think the
>>>> proper deployment is an active/active Pacemaker managed cluster.
>>>>
>>>> I'm trying to follow the "Cluster from scratch" article in order to
>>>> build a 2 node cluster on an experimental setup:
>>>>
>>>> 2 GNU/Linux Debian Unstable (sid) Virtual Machines (Kernel
>>>> 3.0.0-1-686-pae, Apache/2.2.21 (Debian)) on same LAN network.
>>>>
>>>> node-0 IP: 192.168.0.101
>>>> node-1 IP: 192.168.0.102
>>>> Desired Cluster Virtual IP: 192.168.0.100
>>>>
>>>> The two nodes are setup to communicate with proper SSH keys and it
>>>> works flawlessly. Also they can communicate with short names:
>>>>
>>>> root@node-0:~# ssh node-1 -- hostname
>>>> node-1
>>>>
>>>> root@node-1:~# ssh node-0 -- hostname
>>>> node-0
>>>>
>>>> My problem is that although I've reached the part where you have the
>>>> ClusterIP resource setup properly, the Apache resource does not get
>>>> started in either node. The logs do not have a message explaining the
>>>> failure in detail, even with debug messages enabled. All related
>>>> messages report unknown errors while trying to start the service and
>>>> after a while the cluster manager gives up. From the messages it seems
>>>> like the manager is getting unexpected exit codes from the Apache
>>>> resource. The server-status URL is accessible from 127.0.0.1 in both
>>>> nodes.
>>>>
>>>> root@node-0:~# crm_mon -1
>>>> ============
>>>> Last updated: Fri Sep 30 14:04:55 2011
>>>> Stack: openais
>>>> Current DC: node-1 - partition with quorum
>>>> Version: 1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
>>>> 2 Nodes configured, 2 expected votes
>>>> 2 Resources configured.
>>>> ============
>>>>
>>>> Online: [ node-1 node-0 ]
>>>>
>>>> ClusterIP (ocf::heartbeat:IPaddr2): Started node-1
>>>>
>>>> Failed actions:
>>>>     Apache2_monitor_0 (node=node-0, call=3, rc=1, status=complete): unknown error
>>>>     Apache2_start_0 (node=node-0, call=5, rc=1, status=complete): unknown error
>>>>     Apache2_monitor_0 (node=node-1, call=8, rc=1, status=complete): unknown error
>>>>     Apache2_start_0 (node=node-1, call=10, rc=1, status=complete): unknown error
>>>>
>>>> Let's checkout the logs for this resource:
>>>>
>>>> root@node-0:~# grep ERROR.*Apache2 /var/log/corosync/corosync.log
>>>> (Nothing)
>>>>
>>>> root@node-0:~# grep WARN.*Apache2 /var/log/corosync/corosync.log
>>>> Sep 30 14:04:23 node-0 lrmd: [2555]: WARN: Managed Apache2:monitor process 2802 exited with return code 1.
>>>> Sep 30 14:04:30 node-0 lrmd: [2555]: WARN: Managed Apache2:start process 2942 exited with return code 1.
>>>>
>>>> root@node-1:~# grep ERROR.*Apache2 /var/log/corosync/corosync.log
>>>> Sep 30 14:04:23 node-1 pengine: [1676]: ERROR: native_create_actions: Resource Apache2 (ocf::apache) is active on 2 nodes attempting recovery
>>>>
>>>> root@node-1:~# grep WARN.*Apache2 /var/log/corosync/corosync.log
>>>> Sep 30 14:04:23 node-1 lrmd: [1674]: WARN: Managed Apache2:monitor process 3006 exited with return code 1.
>>>> Sep 30 14:04:23 node-1 crmd: [1677]: WARN: status_from_rc: Action 5 (Apache2_monitor_0) on node-1 failed (target: 7 vs. rc: 1): Error
>>>> Sep 30 14:04:23 node-1 crmd: [1677]: WARN: status_from_rc: Action 7 (Apache2_monitor_0) on node-0 failed (target: 7 vs. rc: 1): Error
>>>> Sep 30 14:04:23 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_monitor_0 on node-0: unknown error (1)
>>>> Sep 30 14:04:23 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_monitor_0 on node-1: unknown error (1)
>>>> Sep 30 14:04:30 node-1 crmd: [1677]: WARN: status_from_rc: Action 10 (Apache2_start_0) on node-0 failed (target: 0 vs. rc: 1): Error
>>>> Sep 30 14:04:30 node-1 crmd: [1677]: WARN: update_failcount: Updating failcount for Apache2 on node-0 after failed start: rc=1 (update=INFINITY, time=1317380670)
>>>> Sep 30 14:04:31 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_monitor_0 on node-0: unknown error (1)
>>>> Sep 30 14:04:31 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_start_0 on node-0: unknown error (1)
>>>> Sep 30 14:04:31 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_monitor_0 on node-1: unknown error (1)
>>>> Sep 30 14:04:31 node-1 pengine: [1676]: WARN: common_apply_stickiness: Forcing Apache2 away from node-0 after 1000000 failures (max=1000000)
>>>> Sep 30 14:04:31 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_monitor_0 on node-0: unknown error (1)
>>>> Sep 30 14:04:31 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_start_0 on node-0: unknown error (1)
>>>> Sep 30 14:04:31 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_monitor_0 on node-1: unknown error (1)
>>>> Sep 30 14:04:31 node-1 pengine: [1676]: WARN: common_apply_stickiness: Forcing Apache2 away from node-0 after 1000000 failures (max=1000000)
>>>> Sep 30 14:04:36 node-1 lrmd: [1674]: WARN: Managed Apache2:start process 3146 exited with return code 1.
>>>> Sep 30 14:04:36 node-1 crmd: [1677]: WARN: status_from_rc: Action 9 (Apache2_start_0) on node-1 failed (target: 0 vs. rc: 1): Error
>>>> Sep 30 14:04:36 node-1 crmd: [1677]: WARN: update_failcount: Updating failcount for Apache2 on node-1 after failed start: rc=1 (update=INFINITY, time=1317380676)
>>>> Sep 30 14:04:37 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_monitor_0 on node-0: unknown error (1)
>>>> Sep 30 14:04:37 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_start_0 on node-0: unknown error (1)
>>>> Sep 30 14:04:37 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_monitor_0 on node-1: unknown error (1)
>>>> Sep 30 14:04:37 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_start_0 on node-1: unknown error (1)
>>>> Sep 30 14:04:37 node-1 pengine: [1676]: WARN: common_apply_stickiness: Forcing Apache2 away from node-1 after 1000000 failures (max=1000000)
>>>> Sep 30 14:04:37 node-1 pengine: [1676]: WARN: common_apply_stickiness: Forcing Apache2 away from node-0 after 1000000 failures (max=1000000)
>>>> Sep 30 14:04:37 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_monitor_0 on node-0: unknown error (1)
>>>> Sep 30 14:04:37 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_start_0 on node-0: unknown error (1)
>>>> Sep 30 14:04:37 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_monitor_0 on node-1: unknown error (1)
>>>> Sep 30 14:04:37 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_start_0 on node-1: unknown error (1)
>>>> Sep 30 14:04:37 node-1 pengine: [1676]: WARN: common_apply_stickiness: Forcing Apache2 away from node-1 after 1000000 failures (max=1000000)
>>>> Sep 30 14:04:37 node-1 pengine: [1676]: WARN: common_apply_stickiness: Forcing Apache2 away from node-0 after 1000000 failures (max=1000000)
>>>> Sep 30 14:13:38 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_monitor_0 on node-0: unknown error (1)
>>>> Sep 30 14:13:38 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_start_0 on node-0: unknown error (1)
>>>> Sep 30 14:13:38 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_monitor_0 on node-1: unknown error (1)
>>>> Sep 30 14:13:38 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_start_0 on node-1: unknown error (1)
>>>> Sep 30 14:13:38 node-1 pengine: [1676]: WARN: common_apply_stickiness: Forcing Apache2 away from node-1 after 1000000 failures (max=1000000)
>>>> Sep 30 14:13:38 node-1 pengine: [1676]: WARN: common_apply_stickiness: Forcing Apache2 away from node-0 after 1000000 failures (max=1000000)
>>>> Sep 30 14:13:52 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_monitor_0 on node-1: unknown error (1)
>>>> Sep 30 14:13:52 node-1 pengine: [1676]: WARN: unpack_rsc_op: Processing failed op Apache2_start_0 on node-1: unknown error (1)
>>>> Sep 30 14:13:52 node-1 pengine: [1676]: WARN: common_apply_stickiness: Forcing Apache2 away from node-1 after 1000000 failures (max=1000000)
>>>> Sep 30 14:13:52 node-1 pengine: [1676]: WARN: common_apply_stickiness: Forcing Apache2 away from node-0 after 1000000 failures (max=1000000)
>>>>
>>>> Any suggestions?
>>>>
>>>> File /etc/corosync/corosync.conf (Only changes here, see attached for
>>>> full file)
>>>>
>>>> # Please read the openais.conf.5 manual page
>>>>
>>>> totem {
>>>>
>>>>     ... (Default)
>>>>
>>>>     interface {
>>>>         # The following values need to be set based on your environment
>>>>         ringnumber: 0
>>>>         bindnetaddr: 192.168.0.0
>>>>         mcastaddr: 226.94.1.1
>>>>         mcastport: 5405
>>>>     }
>>>> }
>>>>
>>>> ... (Default)
>>>>
>>>> service {
>>>>     # Load the Pacemaker Cluster Resource Manager
>>>>     ver: 1
>>>>     name: pacemaker
>>>> }
>>>>
>>>> ... (Default)
>>>>
>>>> logging {
>>>>     fileline: off
>>>>     to_stderr: no
>>>>     to_logfile: yes
>>>>     logfile: /var/log/corosync/corosync.log
>>>>     to_syslog: no
>>>>     syslog_facility: daemon
>>>>     debug: on
>>>>     timestamp: on
>>>>     logger_subsys {
>>>>         subsys: AMF
>>>>         debug: off
>>>>         tags: enter|leave|trace1|trace2|trace3|trace4|trace6
>>>>     }
>>>> }
>>>>
>>>> --
>>>> Koutsokeras Miltiadis M.Sc.
>>>> Software Engineer
>>>> Biovista Inc.
>>>>
>>>> US Offices
>>>> 2421 Ivy Road
>>>> Charlottesville, VA 22903
>>>> USA
>>>> T: +1.434.971.1141
>>>> F: +1.434.971.1144
>>>>
>>>> European Offices
>>>> 34 Rodopoleos Street
>>>> Ellinikon, Athens 16777
>>>> GREECE
>>>> T: +30.210.9629848
>>>> F: +30.210.9647606
>>>>
>>>> www.biovista.com
>>>>
>>>> Biovista is a privately held biotechnology company that finds novel
>>>> uses for existing drugs, and profiles their side effects using their
>>>> mechanism of action. Biovista develops its own pipeline of drugs in
>>>> CNS, oncology, auto-immune and rare diseases. Biovista is
>>>> collaborating with biopharmaceutical companies on indication expansion
>>>> and de-risking of their portfolios and with the FDA on adverse event
>>>> prediction.
>>>>
>>>> _______________________________________________
>>>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>>>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>>>
>>>> Project Home: http://www.clusterlabs.org
>>>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>>>> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker