Re: [Pacemaker] resources not failing over to standby node when primary node powered off
On Mon, Sep 6, 2010 at 5:03 PM, Gerry Kernan wrote:
> Hi
>
> I have a 2 node cluster. I have a drbd:filesystem resource plus an IPaddr2
> resource and 3 LSB init resources to start https, asterisk and
> orderlystatsse. I can migrate the resources manually, but if I power off the
> primary node the resources don't fail over.
>
> The output of crm configure show is below. Would "attributes
> standby="off"" be causing this problem,

Shouldn't be.

> if so what should it be.

We'd need to see the full cib (including status section) when the cluster is in this state.
Attach the result of cibadmin -Ql

> node ho-asterisk1-11315.interlink.local \
>         attributes standby="off"
> node ho-asterisk2-11314.interlink.local
> primitive res_Filesystem_drbd ocf:heartbeat:Filesystem \
>         params device="/dev/drbd0" directory="/rep" fstype="ext3" \
>         operations $id="res_Filesystem_drbd-operations" \
>         op start interval="0" timeout="60" \
>         op stop interval="0" timeout="60" \
>         op monitor interval="20" timeout="40" start-delay="0" \
>         meta target-role="started"
> primitive res_IPaddr2_IPaddr ocf:heartbeat:IPaddr2 \
>         params ip="10.1.2.97" cidr_netmask="255.255.0.0" broadcast="10.1.255.255" \
>         operations $id="res_IPaddr2_IPaddr-operations" \
>         op start interval="0" timeout="20" \
>         op stop interval="0" timeout="20" \
>         op monitor interval="10" timeout="20" start-delay="0" \
>         meta target-role="started"
> primitive res_asterisk_asterisk lsb:asterisk \
>         operations $id="res_asterisk_asterisk-operations" \
>         op start interval="0" timeout="15" \
>         op stop interval="0" timeout="15" \
>         op monitor interval="15" timeout="15" start-delay="15" \
>         meta target-role="started"
> primitive res_drbd_1 ocf:linbit:drbd \
>         params drbd_resource="asterisk" \
>         operations $id="res_drbd_1-operations" \
>         op start interval="0" timeout="240" \
>         op promote interval="0" timeout="90" \
>         op demote interval="0" timeout="90" \
>         op stop interval="0" timeout="100" \
>         op monitor interval="10" timeout="20" start-delay="1min" \
>         meta target-role="started"
> primitive res_httpd_http lsb:httpd \
>         operations $id="res_httpd_http-operations" \
>         op start interval="0" timeout="15" \
>         op stop interval="0" timeout="15" \
>         op monitor interval="15" timeout="15" start-delay="15"
> primitive res_mysqld_mysql lsb:mysqld \
>         operations $id="res_mysqld_mysql-operations" \
>         op start interval="0" timeout="15" \
>         op stop interval="0" timeout="15" \
>         op monitor interval="15" timeout="15" start-delay="15"
> primitive res_orderlystatsse_orderlyse lsb:orderlystatsse \
>         operations $id="res_orderlystatsse_orderlyse-operations" \
>         op start interval="0" timeout="15" \
>         op stop interval="0" timeout="15" \
>         op monitor interval="15" timeout="15" start-delay="15" \
>         meta resource-stickiness="0"
> group asterisk res_Filesystem_drbd res_IPaddr2_IPaddr res_asterisk_asterisk
> res_httpd_http res_mysqld_mysql res_orderlystatsse_orderlyse
> ms ms_drbd_1 res_drbd_1 \
>         meta clone-max="2" notify="true"
> colocation col_res_Filesystem_drbd_ms_drbd_1 inf: asterisk ms_drbd_1:Master
> order ord_ms_drbd_1_res_Filesystem_drbd inf: ms_drbd_1:promote asterisk:start
> property $id="cib-bootstrap-options" \
>         default-resource-stickiness="1000" \
>         expected-quorum-votes="2" \
>         stonith-enabled="false" \
>         stonith-action="poweroff" \
>         dc-version="1.0.9-89bd754939df5150de7cd76835f98fe90851b677" \
>         no-quorum-policy="ignore" \
>         cluster-infrastructure="openais" \
>         last-lrm-refresh="1283443619"
>
> Regards,
> Gerry Kernan
> InfinityIT
>
> Suite 17 The Mall,
> Beacon court,
> Sandyford,
> Dublin 18.
>
> p:+353-1-2930090
> f:+353-1-2930137
>
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
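For anyone following along, here is a minimal sketch of capturing what the reply asks for, taken while the primary is powered off and the resources have not moved. cibadmin -Ql is the command named in the reply; the function name capture_cib and the default output path are made up for illustration.

```shell
#!/bin/sh
# Dump the local node's full CIB (configuration plus status section) to a file.
# capture_cib and the default path are illustrative names, not part of Pacemaker.
capture_cib() {
    out="${1:-/tmp/cib-failover.xml}"
    if ! command -v cibadmin >/dev/null 2>&1; then
        echo "cibadmin not found; run this on a cluster node" >&2
        return 1
    fi
    # -Q queries the CIB, -l asks the local node instead of the DC
    cibadmin -Ql > "$out" && echo "CIB written to $out"
}

# Usage, on the surviving node while the primary is powered off:
#   capture_cib /tmp/cib-failover.xml
```

The resulting file is what would be attached to the list post.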
[Pacemaker] Problem with Depends and Conflicts - apt-get install pacemaker heartbeat - using backports.debian.org and www.backports.org
Hi,

I'm having trouble installing pacemaker on Debian lenny.

This is my sources.list:

## main & security repositories
deb http://ftp.us.debian.org/debian/ lenny main
deb-src http://ftp.us.debian.org/debian/ lenny main
deb http://security.debian.org/ lenny/updates main
deb-src http://security.debian.org/ lenny/updates main

# ClusterLabs repository for HA components
#deb http://people.debian.org/~madkiss/ha lenny main
deb http://backports.debian.org/debian-backports lenny-backports main contrib non-free
deb http://www.backports.org/debian lenny-backports main contrib non-free

This is a new install. My system version is i386 2.6.26-2-286 Debian Lenny.

I ran these steps:

apt-get update
apt-get upgrade
apt-get install pacemaker heartbeat
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
  pacemaker: Depends: libcluster-glue but it is not going to be installed
             Depends: libheartbeat2 (>= 1:3.0.3) but it is not going to be installed
             Depends: cluster-agents but it is not going to be installed
E: Broken packages

I tried installing cluster-glue and cluster-agents as well, but the dependency list only grows:

debian01:~# apt-get install pacemaker heartbeat cluster-glue cluster-agents
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
  cluster-agents: Depends: libcluster-glue but it is not going to be installed
                  Conflicts: heartbeat (<= 2.99.2+sles11r9-5) but 2.1.3-6lenny4 is to be installed
  cluster-glue: Depends: libcluster-glue but it is not going to be installed
                Conflicts: heartbeat (<= 2.99.2+sles11r9-5) but 2.1.3-6lenny4 is to be installed
  pacemaker: Depends: libcluster-glue but it is not going to be installed
             Depends: libheartbeat2 (>= 1:3.0.3) but it is not going to be installed
E: Broken packages

Can someone help me solve this problem?

Tks,
LCR
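One thing the apt output hints at: apt is selecting heartbeat 2.1.3-6lenny4 from plain lenny, which the backported cluster-glue and cluster-agents conflict with. Backports are normally not picked automatically, so a possible fix (an assumption, not something confirmed in this thread) is to name the target release with apt-get's -t option. The sketch below only builds and prints the command; backports_install_cmd is a made-up helper name.

```shell
#!/bin/sh
# Print (do not run) an apt-get command that takes the listed packages from one release.
# backports_install_cmd is a hypothetical helper; -t (--target-release) is a standard apt-get option.
backports_install_cmd() {
    release="$1"
    shift
    printf 'apt-get -t %s install %s\n' "$release" "$*"
}

# Show the command for the packages from the message above
backports_install_cmd lenny-backports pacemaker heartbeat
```

Before running the printed command, apt-cache policy heartbeat should show which version apt currently treats as the install candidate.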
Re: [Pacemaker] cib fails to start until host is rebooted
On Mon, 6 Sep 2010, Andrew Beekhof wrote:

> >> Is /dev/shm full (or not mounted) by any chance?
> >
> > No - I tried clearing that out, too.
>
> And corosync is actually running?

Yes, it's logging "[IPC ] Invalid IPC credentials." when cib tries to connect.

Mike
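The two checks asked about in this exchange (is /dev/shm full, is corosync running) can be sketched as below. Both helper names are made up, and in this thread both checks came back clean while the IPC error persisted, so this only rules things out.

```shell
#!/bin/sh
# Report how full the filesystem holding a directory is, as an integer percentage.
# shm_usage_pct is an illustrative helper name; it relies only on POSIX df -P.
shm_usage_pct() {
    df -P "${1:-/dev/shm}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed if a process with exactly this command name is running.
# is_running is an illustrative helper name built on POSIX ps.
is_running() {
    ps -e -o comm= | grep -qx "$1"
}

# Usage on the affected node:
#   echo "/dev/shm is $(shm_usage_pct /dev/shm)% full"
#   is_running corosync && echo "corosync is running" || echo "corosync is NOT running"
```

If the last column of df -P /dev/shm is not /dev/shm itself, the directory is probably not a separate mount, which covers the "not mounted" half of the question.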
[Pacemaker] resources not failing over to standby node when primary node powered off
Hi

I have a 2 node cluster. I have a drbd:filesystem resource plus an IPaddr2 resource and 3 LSB init resources to start https, asterisk and orderlystatsse. I can migrate the resources manually, but if I power off the primary node the resources don't fail over.

The output of crm configure show is below. Would attributes standby="off" be causing this problem? If so, what should it be?

node ho-asterisk1-11315.interlink.local \
        attributes standby="off"
node ho-asterisk2-11314.interlink.local
primitive res_Filesystem_drbd ocf:heartbeat:Filesystem \
        params device="/dev/drbd0" directory="/rep" fstype="ext3" \
        operations $id="res_Filesystem_drbd-operations" \
        op start interval="0" timeout="60" \
        op stop interval="0" timeout="60" \
        op monitor interval="20" timeout="40" start-delay="0" \
        meta target-role="started"
primitive res_IPaddr2_IPaddr ocf:heartbeat:IPaddr2 \
        params ip="10.1.2.97" cidr_netmask="255.255.0.0" broadcast="10.1.255.255" \
        operations $id="res_IPaddr2_IPaddr-operations" \
        op start interval="0" timeout="20" \
        op stop interval="0" timeout="20" \
        op monitor interval="10" timeout="20" start-delay="0" \
        meta target-role="started"
primitive res_asterisk_asterisk lsb:asterisk \
        operations $id="res_asterisk_asterisk-operations" \
        op start interval="0" timeout="15" \
        op stop interval="0" timeout="15" \
        op monitor interval="15" timeout="15" start-delay="15" \
        meta target-role="started"
primitive res_drbd_1 ocf:linbit:drbd \
        params drbd_resource="asterisk" \
        operations $id="res_drbd_1-operations" \
        op start interval="0" timeout="240" \
        op promote interval="0" timeout="90" \
        op demote interval="0" timeout="90" \
        op stop interval="0" timeout="100" \
        op monitor interval="10" timeout="20" start-delay="1min" \
        meta target-role="started"
primitive res_httpd_http lsb:httpd \
        operations $id="res_httpd_http-operations" \
        op start interval="0" timeout="15" \
        op stop interval="0" timeout="15" \
        op monitor interval="15" timeout="15" start-delay="15"
primitive res_mysqld_mysql lsb:mysqld \
        operations $id="res_mysqld_mysql-operations" \
        op start interval="0" timeout="15" \
        op stop interval="0" timeout="15" \
        op monitor interval="15" timeout="15" start-delay="15"
primitive res_orderlystatsse_orderlyse lsb:orderlystatsse \
        operations $id="res_orderlystatsse_orderlyse-operations" \
        op start interval="0" timeout="15" \
        op stop interval="0" timeout="15" \
        op monitor interval="15" timeout="15" start-delay="15" \
        meta resource-stickiness="0"
group asterisk res_Filesystem_drbd res_IPaddr2_IPaddr res_asterisk_asterisk res_httpd_http res_mysqld_mysql res_orderlystatsse_orderlyse
ms ms_drbd_1 res_drbd_1 \
        meta clone-max="2" notify="true"
colocation col_res_Filesystem_drbd_ms_drbd_1 inf: asterisk ms_drbd_1:Master
order ord_ms_drbd_1_res_Filesystem_drbd inf: ms_drbd_1:promote asterisk:start
property $id="cib-bootstrap-options" \
        default-resource-stickiness="1000" \
        expected-quorum-votes="2" \
        stonith-enabled="false" \
        stonith-action="poweroff" \
        dc-version="1.0.9-89bd754939df5150de7cd76835f98fe90851b677" \
        no-quorum-policy="ignore" \
        cluster-infrastructure="openais" \
        last-lrm-refresh="1283443619"

Regards,
Gerry Kernan
InfinityIT

Suite 17 The Mall,
Beacon court,
Sandyford,
Dublin 18.

p:+353-1-2930090
f:+353-1-2930137
[Pacemaker] bugzilla #2480 - group-node-node crm_mon prints
Hello Andrew,

is there a reason not to also print "FAILED" in crm_mon's group-by-node mode? The commit

http://hg.clusterlabs.org/pacemaker/1.1/rev/9084e64bce3a

only has half of the 2nd patch I attached to the bugzilla.

As we have dozens of resources on many hosts, we always use group-by-node. Actually we use a wrapper that calls "crm_mon -1 -r -n" to give us the cluster status. Besides the "unmanaged" flag, which has been missing so far, "FAILED" is another important piece of information that is not shown.

Thanks,
Bernd

--
Bernd Schubert
DataDirect Networks
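A wrapper of the kind described might look roughly like the sketch below. It is not Bernd's actual script: attention_lines is a made-up name, and the filter assumes the words FAILED and unmanaged appear literally in the crm_mon text output, which in group-by-node mode is exactly what this request is about.

```shell
#!/bin/sh
# Keep only the lines of crm_mon text output that mention FAILED or unmanaged.
# attention_lines is a hypothetical helper; it reads crm_mon output on stdin.
attention_lines() {
    grep -iE 'FAILED|unmanaged'
}

# Usage on a cluster node, with the flags quoted in the message:
#   crm_mon -1 -r -n | attention_lines
```

On a crm_mon that does not print those markers in group-by-node mode, the filter simply returns nothing.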
[Pacemaker] Can't mkfs.gfs2 on second node of 2
After setting up a 2-node cluster following the Clusters from Scratch guide for Fedora 13, I can't create a gfs2 filesystem on my second node.

With corosync stopped, mkfs.gfs2 says "could not stat device /dev/drbd1".
With corosync started and the filesystem resource stopped, it says "read only filesystem".

Any hint?
Re: [Pacemaker] Resync too slow! Cat /proc/drbd shows 240k/s
That is it! Thanks a lot, Dan!

Date: Mon, 6 Sep 2010 10:27:32 +0300
From: dfri...@streamwide.ro
To: pacemaker@oss.clusterlabs.org
Subject: Re: [Pacemaker] Resync too slow! Cat /proc/drbd shows 240k/s

> Alisson Landim wrote:
>> Hi.
>>
>> After setting up a 2-node cluster from the Clusters from Scratch guide
>> using Fedora 13, I saw that the resync of data is too slow.
>> Cat /proc/drbd shows 240k/s
>> If you look at the Clusters from Scratch guide here:
>>
>> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch07s02s03.html
>>
>> you can see that the speed in that example is 240k too.
>>
>> How can I increase this speed?
>
> Check /etc/drbd.conf for the rate parameter. On Gigabit Ethernet I use 40M.
>
> syncer {
>     rate 40M;
> }
>
> See more here:
> http://www.drbd.org/users-guide/s-configure-syncer-rate.html
>
> Regards,
> Dan.
>
> --
> Dan FRINCU
> Systems Engineer
> CCNA, RHCE
> Streamwide Romania
Re: [Pacemaker] Node doesn't rejoin automatically after reboot
Yes, corosync is running after the reboot. It comes up with the regular
init-procedure (runlevel 3 in my case).

2010/9/6 Andrew Beekhof :
> On Mon, Sep 6, 2010 at 7:57 AM, Tom Tux wrote:
>> No, I don't have such failed-messages. In my case, the "Connection to
>> our AIS plugin" was established.
>>
>> The /dev/shm is also not full.
>
> Is corosync running?
>
>> Kind regards,
>> Tom
>>
>> 2010/9/3 Michael Smith :
>>> Tom Tux wrote:
>>>
>>>> If I disjoin one clusternode (node01) for maintenance-purposes
>>>> (/etc/init.d/openais stop) and reboot this node, then it will not join
>>>> himself automatically into the cluster. After the reboot, I have the
>>>> following error- and warn-messages in the log:
>>>>
>>>> Sep 3 07:34:15 node01 mgmtd: [9202]: info: login to cib failed: live
>>>
>>> Do you have messages like this, too?
>>>
>>> Aug 30 15:48:10 xen-test1 corosync[5851]: [IPC ] Invalid IPC credentials.
>>> Aug 30 15:48:10 xen-test1 cib: [5858]: info: init_ais_connection:
>>> Connection to our AIS plugin (9) failed: unknown (100)
>>>
>>> Aug 30 15:48:10 xen-test1 cib: [5858]: CRIT: cib_init: Cannot sign in to
>>> the cluster... terminating
>>>
>>> http://news.gmane.org/find-root.php?message_id=%3c4C7C0EC7.2050708%40cbnco.com%3e
>>>
>>> Mike
Re: [Pacemaker] How can I build pacemaker supporting ais without corosync source code?
On Mon, Sep 6, 2010 at 9:43 AM, Jingcheng zhang wrote:
> Dear Andrew,
> I modified a plugin in pacemaker and want to build pacemaker independently
> into an RPM package. But when I run configure, I get an error about "choose one
> cluster stack to support". If I set the environment variable
> PKG_CONFIG_PATH to point to the corosync source code directory, the build
> succeeds. How can I choose to support corosync without including the corosync
> source code?

Uh, you can't. That's impossible.
[Pacemaker] How can I build pacemaker supporting ais without corosync source code?
Dear Andrew,

I modified a plugin in pacemaker and want to build pacemaker independently into an RPM package. But when I run configure, I get an error about "choose one cluster stack to support". If I set the environment variable PKG_CONFIG_PATH to point to the corosync source code directory, the build succeeds. How can I choose to support corosync without including the corosync source code?

Thanks
Jason
Re: [Pacemaker] Resync too slow! Cat /proc/drbd shows 240k/s
Alisson Landim wrote:
> Hi.
>
> After setting up a 2-node cluster from the Clusters from Scratch guide
> using Fedora 13, I saw that the resync of data is too slow.
> Cat /proc/drbd shows 240k/s
> If you look at the Clusters from Scratch guide here:
>
> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch07s02s03.html
>
> you can see that the speed in that example is 240k too.
>
> How can I increase this speed?

Check /etc/drbd.conf for the rate parameter. On Gigabit Ethernet I use 40M.

syncer {
    rate 40M;
}

See more here:
http://www.drbd.org/users-guide/s-configure-syncer-rate.html

Regards,
Dan.

--
Dan FRINCU
Systems Engineer
CCNA, RHCE
Streamwide Romania
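As a rough way to pick a value for rate: the DRBD user's guide page linked in the reply suggests about 30% of the available replication bandwidth as a rule of thumb, taking whichever of network and disk is slower. The sketch below just does that arithmetic; sync_rate_mb is a made-up helper name and the throughput figures are example inputs, not measurements from this cluster.

```shell
#!/bin/sh
# Suggest a syncer rate as 30% of the slower of network and disk throughput (both in MB/s).
# sync_rate_mb is a hypothetical helper; the 30% figure is a rule of thumb, not a hard limit.
sync_rate_mb() {
    net="$1"
    disk="$2"
    if [ "$net" -lt "$disk" ]; then slowest="$net"; else slowest="$disk"; fi
    echo $(( slowest * 30 / 100 ))
}

# Example inputs: Gigabit Ethernet sustaining about 110 MB/s, disks about 180 MB/s
echo "syncer { rate $(sync_rate_mb 110 180)M; }"
```

With those inputs the suggestion is 33M, a little below the 40M used in the reply. After editing /etc/drbd.conf on both nodes, drbdadm adjust all should pick up the new rate without restarting DRBD.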
Re: [Pacemaker] Resync too slow! Cat /proc/drbd shows 240k/s
Probably a better question for the drbd list.
Though some of the guys hang out here too.

On Mon, Sep 6, 2010 at 9:03 AM, Alisson Landim wrote:
> Hi.
>
> After setting up a 2-node cluster from the Clusters from Scratch guide
> using Fedora 13, I saw that the resync of data is too slow.
> Cat /proc/drbd shows 240k/s
> If you look at the Clusters from Scratch guide here:
>
> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch07s02s03.html
>
> you can see that the speed in that example is 240k too.
>
> How can I increase this speed?
[Pacemaker] Resync too slow! Cat /proc/drbd shows 240k/s
Hi.

After setting up a 2-node cluster from the Clusters from Scratch guide using Fedora 13, I saw that the resync of data is too slow.
Cat /proc/drbd shows 240k/s
If you look at the Clusters from Scratch guide here:

http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch07s02s03.html

you can see that the speed in that example is 240k too.

How can I increase this speed?