Re: [Pacemaker] don't want to restart clone resource
Hi Andrew,

Is crm_report included in pacemaker-1.0.12-1.el5.centos? I couldn't find it.

2012/2/4 Andrew Beekhof
> On Fri, Feb 3, 2012 at 9:35 PM, Fanghao Sha wrote:
> > Sorry, I don't know how to file a bug,
>
> See the links at the bottom of every mail on this list?
>
> > and I have only the "messages" file.
>
> man crm_report
>
> > I have tried to set clone-max=3, and after removing node-1, the clone
> > resource running on node-2 did not restart.
> > But when I added another node-3 to the cluster with "hb_addnode", the clone
> > resource running on node-2 became orphaned and restarted.
> >
> > In the attached "messages" file, I couldn't understand this line:
> > "find_clone: Internally renamed node-app-rsc:2 on node-2 to node-app-rsc:3
> > (ORPHAN)".
> >
> > 2012/2/2 Andrew Beekhof
> >> On Thu, Feb 2, 2012 at 4:57 AM, Lars Ellenberg wrote:
> >> > On Wed, Feb 01, 2012 at 03:43:55PM +0100, Andreas Kurz wrote:
> >> >> Hello,
> >> >>
> >> >> On 02/01/2012 10:39 AM, Fanghao Sha wrote:
> >> >> > Hi Lars,
> >> >> >
> >> >> > Yes, you are right. But how do I prevent the "orphaned" resources
> >> >> > from being stopped by default, please?
> >> >>
> >> >> crm configure property stop-orphan-resources=false
> >> >
> >> > Well, sure. But for "normal" orphans,
> >> > you actually want them to be stopped.
> >> >
> >> > No, pacemaker needs some additional smarts to recognize
> >> > that there actually are no orphans, maybe by first relabelling,
> >> > and only then checking for instance label > clone-max.
> >>
> >> The instance label doesn't come into the equation.
> >> It might look like it does on the outside, but it's more complicated
> >> than that.
> >>
> >> > Did you file a bugzilla?
> >> > Has that made progress?
> >> >
> >> > --
> >> > : Lars Ellenberg
> >> > : LINBIT | Your Way to High Availability
> >> > : DRBD/HA support and consulting http://www.linbit.com
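For reference, the two knobs discussed in this thread can be set from the crm
shell. This is only a minimal sketch, assuming the clone is called
node-app-clone (the real clone name is not shown in the thread); raising
clone-max before the third node joins may avoid the existing instances being
relabelled as orphans:

    # allow a third instance before node-3 is added (hypothetical clone name)
    crm resource meta node-app-clone set clone-max 3

    # or, as Andreas suggested, keep the cluster from stopping orphans at all
    crm configure property stop-orphan-resources=false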
Re: [Pacemaker] [ANNOUNCE] The Booth Cluster Ticket Manager - part of multi-site support in pacemaker
On Tue, 2012-02-07 at 11:09 +0900, 李泰勲 wrote:
> Hi Jiaju
>
> I am testing how booth works while investigating the booth source code.
>
> I don't fully understand the ticket grant and revoke process and how the
> booth instances connect to each other, so I would like to understand how
> booth works as implemented in your source code.
>
> Could you give me information about booth's sequences: the ticket grant,
> revoke and lease logic, and how a ticket's expiry time works?

Generally speaking, you would want to grant a ticket to a certain site
initially, which means the corresponding resources can run at that site.

For the lease logic, granting a ticket means that site holds the ticket
lease. The lease has an expiry time; after that time the lease is expired and
the corresponding resources can no longer run at that site. If the site which
was originally granted the ticket is alive, it will renew the lease before the
ticket expires; but if that site is broken, then once the lease expires the
lease logic goes into an election stage and a new site will get the ticket
lease, so the resources will be able to run at the new site.

You can also revoke the ticket from a site, but in most cases you may not want
to do this. The possible scenario I can think of is when the admin wants to do
some maintenance work, or wants to do the ticket management manually.

> When do you think booth's implementation will be fixed and completed?

Oh, I have not finished it yet;) But I'm still working on it. Since I also
have some other tasks, the progress may not be fast these days;)

> Is there anything I can do to help with booth's implementation, etc.?

The framework is finished, but there are still some bugs in it, so the code
may not work for you for the time being. I'll be more than happy if anyone can
help to fix bugs or develop new features;) For the short term, I think adding
the man pages, documentation and some automated test programs/scripts would be
very good. For the long term, I also have something new in mind; maybe I
should add a TODO to document it later. Well, the primary thing for now is to
fix the current bugs to make it really work, and I myself will spend more time
on it these two weeks;)

Thanks,
Jiaju

> Best Regards, Taihun
>
> (2011/12/05 15:18), Jiaju Zhang wrote:
> > Hello everyone,
> >
> > I'm happy to announce the Booth cluster ticket manager, which is part of
> > the key feature for pacemaker in 2011 - improving support for multi-site
> > clusters.
> >
> > Multi-site clusters can be considered as "overlay" clusters where each
> > cluster site corresponds to a cluster node in a traditional cluster. The
> > overlay cluster is managed by the booth mechanism. It guarantees that
> > the cluster resources will be highly available across the different
> > cluster sites. This is achieved by using so-called tickets that are
> > treated as failover domains between cluster sites, in case a site should
> > be down.
> >
> > Booth is designed to be an add-on to pacemaker, and it is now also
> > hosted at ClusterLabs, together with pacemaker. You can find it at:
> >
> > https://github.com/ClusterLabs/booth
> >
> > Booth is still in heavy development, so it may not work for you for the
> > time being;) But I'll be working on it ...
> >
> > Review and comments are highly appreciated!
> >
> > Thanks,
> > Jiaju
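To make the grant/revoke and lease discussion above concrete, here is a rough,
hypothetical sketch of how a ticket might be granted and tied to a resource.
The booth client flags and the rsc_ticket CIB support were still in flux at
this point, so treat the commands as illustrative only (ticketA, db-ms and the
loss-policy value are made-up names):

    # grant ticketA to the local site (booth client syntax may differ
    # in this early code)
    booth client grant -t ticketA

    # in the CIB, only run the master role of db-ms while ticketA is held;
    # demote it if the ticket is lost
    crm configure rsc_ticket ticketA-db ticketA: db-ms:Master loss-policy=demote

    # revoking the ticket (e.g. for maintenance) releases the resources again
    booth client revoke -t ticketA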
Re: [Pacemaker] Error building Pacemaker on OS X Lion
On Wed, Feb 8, 2012 at 12:39 AM, i...@sdips.de wrote:
> Am 06.02.12 22:00, schrieb Andrew Beekhof:
>> On Tue, Feb 7, 2012 at 1:05 AM, i...@sdips.de wrote:
>>> Am 29.01.12 23:03, schrieb Andrew Beekhof:
>>>> On Thu, Jan 26, 2012 at 10:25 PM, i...@sdips.de wrote:
>>>>> I've started all over with MacPorts. After some struggle with gettext,
>>>>> the only configure invocation that works is with --prefix=/opt/local.
>>>>> But I'm stuck at the same issue when building pacemaker.
>>>>>
>>>>> ./configure --prefix=/opt/local --with-initdir=/private/etc/mach_init.d --with-heartbeat
>>>>> ...
>>>>> checking for struct lrm_ops.fail_rsc... yes
>>>>> checking for ll_cluster_new in -lhbclient... no
>>>>> configure: error: in `/Users/admin/1.1':
>>>>> configure: error: Unable to support Heartbeat: client libraries not found
>>>>> See `config.log' for more details
>>>>>
>>>>> The only error I had during the build was in glue, where logd can't be
>>>>> built. Is this the missing part that prevents Pacemaker from building?
>>>>>
>>>>> cc1: warnings being treated as errors
>>>>> ha_logd.c: In function 'logd_make_daemon':
>>>>> ha_logd.c:527: warning: 'daemon' is deprecated (declared at /usr/include/stdlib.h:292)
>>>>> make[1]: *** [ha_logd.o] Error 1
>>>>> make: *** [all-recursive] Error 1
>>>>
>>>> It might be necessary to configure with --disable-fatal-warnings (or
>>>> something of that kind)
>>>
>>> Sorry, that doesn't work either.
>>> The build process finished without the previous error, but now
>>> "shellfuncs" is missing.
>>>
>>> /etc/mach_init.d/heartbeat start
>>> /etc/mach_init.d/heartbeat: line 53: /opt/local/etc/ha.d/shellfuncs:
>>> No such file or directory
>>>
>>> The file isn't present on the system, so it wasn't built?
>>
>> There should be a similarly named file in the resource-agents package.
>> Evidently they changed the name and forgot to update heartbeat.
>
> My fault, resource-agents hadn't been installed yet.
> And now I'm running into some new build errors again ;(
>
> In file included from /opt/local/include/libnet.h:81,
>                  from send_arp.libnet.c:44:
> /opt/local/include/./libnet/libnet-functions.h:85: warning: function
> declaration isn't a prototype
> In file included from send_arp.libnet.c:44:
> /opt/local/include/libnet.h:87:2: error: #error "byte order has not
> been specified, you'll need to #define either LIBNET_LIL_ENDIAN or
> LIBNET_BIG_ENDIAN. See the documentation regarding the
> libnet-config script."
> send_arp.libnet.c: In function 'main':
> send_arp.libnet.c:206: warning: comparison is always false due to
> limited range of data type
> make[3]: *** [send_arp-send_arp.libnet.o] Error 1
> make[2]: *** [all-recursive] Error 1
> make[1]: *** [all-recursive] Error 1
> make: *** [all] Error 2
>
> libnet is installed via MacPorts, and LIBNET_LIL_ENDIAN is defined in
> /opt/local/bin/libnet-config. What's wrong now?

The error message seems reasonably helpful. Did you read the documentation it
refers to?
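Since the libnet documentation referenced above boils down to passing the
endianness define to the compiler, one possible (untested) workaround is to
feed the output of libnet-config into the preprocessor flags before re-running
configure. This is only a sketch, under the assumption that the MacPorts
libnet-config supports the usual --defines option:

    # pick up -DLIBNET_LIL_ENDIAN (or BIG) from libnet-config
    export CPPFLAGS="$(/opt/local/bin/libnet-config --defines) $CPPFLAGS"

    ./configure --prefix=/opt/local \
        --with-initdir=/private/etc/mach_init.d --with-heartbeat \
        --disable-fatal-warnings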
Re: [Pacemaker] Proper way to migrate multistate resource?
OK, that makes sense. I will give that a try. Thank you.

--
Chet Burgess
c...@liquidreality.org

On Feb 7, 2012, at 5:08 , Lars Ellenberg wrote:

> On Tue, Feb 07, 2012 at 02:03:32PM +0100, Michael Schwartzkopff wrote:
>>> On Mon, Feb 06, 2012 at 04:48:26PM -0800, Chet Burgess wrote:
>>>> Greetings,
>>>>
>>>> I'm somewhat new to pacemaker and have been playing around with a
>>>> number of configurations in a lab. Most recently I've been testing a
>>>> multistate resource using the ocf:pacemaker:Stateful example RA.
>>>>
>>>> While I've gotten the agent to work, and notice that if I shut down or
>>>> kill a node the resources migrate, I can't seem to figure out the
>>>> proper way to migrate the resource between nodes when they are both up.
>>>>
>>>> For regular resources I've used "crm resource migrate" without
>>>> issue. However, when I try this with a multistate resource it doesn't
>>>> seem to work. When I run the command it just puts the slave node into
>>>> a stopped state. If I try to tell it to migrate specifically to the
>>>> slave node, it claims to already be running there (which I suppose in
>>>> a sense it is).
>>>
>>> the crm shell does not support roles for the "move" or "migrate" command
>>> (yet; maybe in newer versions. Dejan?).
>>>
>>> What you need to do is set a location constraint on the role.
>>> * force master role off from one node:
>>>
>>>   location you-name-it resource-id \
>>>     rule $role=Master -inf: \
>>>     #uname eq node-where-it-should-be-slave
>>>
>>> * or force master role off from all but one node,
>>>   note the double negation in this one:
>>>
>>>   location you-name-it resource-id \
>>>     rule $role=Master -inf: \
>>>     #uname ne node-where-it-should-be-master
>>
>> These constraints would prevent the MS resource from running in Master
>> state even on that node, even in case the preferred node is not available
>> any more. This might not be what Chet wanted.
>
> Well, it is just what crm resource migrate does, otherwise.
>
> After migration, you obviously need to "unmigrate",
> i.e. delete that constraint again.
>
> --
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
>
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
Re: [Pacemaker] Error building Pacemaker on OS X Lion
Am 06.02.12 22:00, schrieb Andrew Beekhof:
> On Tue, Feb 7, 2012 at 1:05 AM, i...@sdips.de wrote:
>> Am 29.01.12 23:03, schrieb Andrew Beekhof:
>>> On Thu, Jan 26, 2012 at 10:25 PM, i...@sdips.de wrote:
>>>> I've started all over with MacPorts. After some struggle with gettext,
>>>> the only configure invocation that works is with --prefix=/opt/local.
>>>> But I'm stuck at the same issue when building pacemaker.
>>>>
>>>> ./configure --prefix=/opt/local --with-initdir=/private/etc/mach_init.d --with-heartbeat
>>>> ...
>>>> checking for struct lrm_ops.fail_rsc... yes
>>>> checking for ll_cluster_new in -lhbclient... no
>>>> configure: error: in `/Users/admin/1.1':
>>>> configure: error: Unable to support Heartbeat: client libraries not found
>>>> See `config.log' for more details
>>>>
>>>> The only error I had during the build was in glue, where logd can't be
>>>> built. Is this the missing part that prevents Pacemaker from building?
>>>>
>>>> cc1: warnings being treated as errors
>>>> ha_logd.c: In function 'logd_make_daemon':
>>>> ha_logd.c:527: warning: 'daemon' is deprecated (declared at /usr/include/stdlib.h:292)
>>>> make[1]: *** [ha_logd.o] Error 1
>>>> make: *** [all-recursive] Error 1
>>>
>>> It might be necessary to configure with --disable-fatal-warnings (or
>>> something of that kind)
>>
>> Sorry, that doesn't work either.
>> The build process finished without the previous error, but now
>> "shellfuncs" is missing.
>>
>> /etc/mach_init.d/heartbeat start
>> /etc/mach_init.d/heartbeat: line 53: /opt/local/etc/ha.d/shellfuncs:
>> No such file or directory
>>
>> The file isn't present on the system, so it wasn't built?
>
> There should be a similarly named file in the resource-agents package.
> Evidently they changed the name and forgot to update heartbeat.

My fault, resource-agents hadn't been installed yet.
And now I'm running into some new build errors again ;(

In file included from /opt/local/include/libnet.h:81,
                 from send_arp.libnet.c:44:
/opt/local/include/./libnet/libnet-functions.h:85: warning: function
declaration isn't a prototype
In file included from send_arp.libnet.c:44:
/opt/local/include/libnet.h:87:2: error: #error "byte order has not
been specified, you'll need to #define either LIBNET_LIL_ENDIAN or
LIBNET_BIG_ENDIAN. See the documentation regarding the
libnet-config script."
send_arp.libnet.c: In function 'main':
send_arp.libnet.c:206: warning: comparison is always false due to
limited range of data type
make[3]: *** [send_arp-send_arp.libnet.o] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

libnet is installed via MacPorts, and LIBNET_LIL_ENDIAN is defined in
/opt/local/bin/libnet-config. What's wrong now?

>> In general, is it possible to get pacemaker running under OS X?
>> Otherwise I'll stop investing more time in something that wasn't tested.
>
> Its been a long time since I ran heartbeat anywhere, let alone on OSX.
> It did work at one point though (and hasn't changed much since), you
> might just need to tweak some init scripts

I appreciate any help.

Am 25.01.12 01:27, schrieb Andrew Beekhof:
> Have you been following this?
> http://www.clusterlabs.org/wiki/Install#Darwin.2FMacOS_X
>
> On Tue, Jan 24, 2012 at 9:58 PM, i...@sdips.de wrote:
>> Hi all,
>>
>> after a clean install of cluster-glue and heartbeat, I have a problem
>> building Pacemaker 1.1.6 under OS X Lion.
>>
>> With ./configure --prefix=/usr/local
>> --with-initdir=/private/etc/mach_init.d --with-heartbeat
>> --libexecdir=/usr/libexec/ I run into the following issue:
>>
>> configure: error: in `/Users/admin/1.1':
>> configure: error: Unable to support Heartbeat: client libraries not found
>> See `config.log' for more details
>>
>> "config.log" shows this:
>>
>> configure:4363: gcc -c conftest.c -o conftest2.o >&5
>> configure:4367: $? = 0
>> configure:4373: gcc -c conftest.c -o conftest2.o >&5
>> configure:4377: $? = 0
>> configure:4388: cc -c conftest.c >&5
>> configure:4392: $? = 0
>> configure:4400: cc -c conftest.c -o conftest2.o >&5
>> configure:4404: $? = 0
>> configure:4410: cc -c conftest.c -o conftest2.o >&5
>> configure:4414: $? = 0
>> configure:4432: result: yes
>> configure:4461: checking for gcc option to accept ISO C99
>> configure:4610: gcc -c -g -O2 conftest.c >&5
>> conftest.c:62: error: expected ';', ',' or ')' before 'text'
>> conftest.c: In function 'main':
>> conftest.c:116: error: nested functions are disabled, use
>> -fnested-functions to re-enable
>> conftest.c:116: error: expected '=', ',', ';', 'asm' or '__attribute__'
>> before 'newvar'
>> conftest.c:116: error: 'newvar' undeclared (first use in this function)
>> conftest.c:116: error: (Each un
Re: [Pacemaker] Proper way to migrate multistate resource?
On Tue, Feb 07, 2012 at 02:03:32PM +0100, Michael Schwartzkopff wrote:
> > On Mon, Feb 06, 2012 at 04:48:26PM -0800, Chet Burgess wrote:
> > > Greetings,
> > >
> > > I'm somewhat new to pacemaker and have been playing around with a
> > > number of configurations in a lab. Most recently I've been testing a
> > > multistate resource using the ocf:pacemaker:Stateful example RA.
> > >
> > > While I've gotten the agent to work, and notice that if I shut down or
> > > kill a node the resources migrate, I can't seem to figure out the
> > > proper way to migrate the resource between nodes when they are both up.
> > >
> > > For regular resources I've used "crm resource migrate" without
> > > issue. However, when I try this with a multistate resource it doesn't
> > > seem to work. When I run the command it just puts the slave node into
> > > a stopped state. If I try to tell it to migrate specifically to the
> > > slave node, it claims to already be running there (which I suppose in
> > > a sense it is).
> >
> > the crm shell does not support roles for the "move" or "migrate" command
> > (yet; maybe in newer versions. Dejan?).
> >
> > What you need to do is set a location constraint on the role.
> > * force master role off from one node:
> >
> >   location you-name-it resource-id \
> >     rule $role=Master -inf: \
> >     #uname eq node-where-it-should-be-slave
> >
> > * or force master role off from all but one node,
> >   note the double negation in this one:
> >
> >   location you-name-it resource-id \
> >     rule $role=Master -inf: \
> >     #uname ne node-where-it-should-be-master
>
> These constraints would prevent the MS resource from running in Master
> state even on that node, even in case the preferred node is not available
> any more. This might not be what Chet wanted.

Well, it is just what crm resource migrate does, otherwise.

After migration, you obviously need to "unmigrate",
i.e. delete that constraint again.

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
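Putting Lars' "migrate then unmigrate" advice into concrete commands, a
minimal sketch with the crm shell might look like this, using the resource and
node names from Chet's configuration later in the thread (the constraint id
move-ms-master is made up):

    # at the "crm configure" prompt: force the master role away from tst3
    location move-ms-master ms-test \
        rule $role=Master -inf: #uname eq tst3.local1.mc.metacloud.com
    commit

    # ... once the master has been promoted elsewhere, "unmigrate" by
    # removing the constraint again
    delete move-ms-master
    commit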
Re: [Pacemaker] Proper way to migrate multistate resource?
> On Mon, Feb 06, 2012 at 04:48:26PM -0800, Chet Burgess wrote:
> > Greetings,
> >
> > I'm somewhat new to pacemaker and have been playing around with a
> > number of configurations in a lab. Most recently I've been testing a
> > multistate resource using the ocf:pacemaker:Stateful example RA.
> >
> > While I've gotten the agent to work, and notice that if I shut down or
> > kill a node the resources migrate, I can't seem to figure out the
> > proper way to migrate the resource between nodes when they are both up.
> >
> > For regular resources I've used "crm resource migrate" without
> > issue. However, when I try this with a multistate resource it doesn't
> > seem to work. When I run the command it just puts the slave node into
> > a stopped state. If I try to tell it to migrate specifically to the
> > slave node, it claims to already be running there (which I suppose in
> > a sense it is).
>
> the crm shell does not support roles for the "move" or "migrate" command
> (yet; maybe in newer versions. Dejan?).
>
> What you need to do is set a location constraint on the role.
> * force master role off from one node:
>
>   location you-name-it resource-id \
>     rule $role=Master -inf: \
>     #uname eq node-where-it-should-be-slave
>
> * or force master role off from all but one node,
>   note the double negation in this one:
>
>   location you-name-it resource-id \
>     rule $role=Master -inf: \
>     #uname ne node-where-it-should-be-master

These constraints would prevent the MS resource from running in Master state
even on that node, even in case the preferred node is not available any more.
This might not be what Chet wanted.

Perhaps it would be easier if you give the resource some points if it runs in
Master state on the preferred node:

location name-of-your-constraint resource-id \
    rule $role=Master 100: \
    #uname eq name-of-the-preferred-node

--
Dr. Michael Schwartzkopff
Guardinistr. 63
81375 München
Tel: (0163) 172 50 98
Fax: (089) 620 304 13
Re: [Pacemaker] Proper way to migrate multistate resource?
On Mon, Feb 06, 2012 at 04:48:26PM -0800, Chet Burgess wrote:
> Greetings,
>
> I'm somewhat new to pacemaker and have been playing around with a
> number of configurations in a lab. Most recently I've been testing a
> multistate resource using the ocf:pacemaker:Stateful example RA.
>
> While I've gotten the agent to work, and notice that if I shut down or
> kill a node the resources migrate, I can't seem to figure out the
> proper way to migrate the resource between nodes when they are both up.
>
> For regular resources I've used "crm resource migrate" without
> issue. However, when I try this with a multistate resource it doesn't
> seem to work. When I run the command it just puts the slave node into
> a stopped state. If I try to tell it to migrate specifically to the
> slave node, it claims to already be running there (which I suppose in
> a sense it is).

the crm shell does not support roles for the "move" or "migrate" command
(yet; maybe in newer versions. Dejan?).

What you need to do is set a location constraint on the role.
* force master role off from one node:

  location you-name-it resource-id \
    rule $role=Master -inf: \
    #uname eq node-where-it-should-be-slave

* or force master role off from all but one node,
  note the double negation in this one:

  location you-name-it resource-id \
    rule $role=Master -inf: \
    #uname ne node-where-it-should-be-master

Cheers,
Lars

---
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

> The only method I've found to safely and reliably migrate a multistate
> resource from one node to another is... I think it has something to do
> with the resource constraints I used to prefer a particular node, but
> I'm not entirely sure how the constraints and the master/slave state
> updating stuff works.
>
> Am I using the wrong tool to migrate a multistate resource, or is my
> configuration wrong in some way? Any input greatly appreciated.
> Thank you.
>
> Configuration:
> r...@tst3.local1.mc:/home/cfb$ crm configure show
> node tst3.local1.mc.metacloud.com
> node tst4.local1.mc.metacloud.com
> primitive stateful-test ocf:pacemaker:Stateful \
>     op monitor interval="30s" role="Slave" \
>     op monitor interval="31s" role="Master"
> ms ms-test stateful-test \
>     meta clone-node-max="1" notify="false" master-max="1" \
>     master-node-max="1" target-role="Master"
> location ms-test_constraint_1 ms-test 25: tst3.local1.mc.metacloud.com
> location ms-test_constraint_2 ms-test 20: tst4.local1.mc.metacloud.com
> property $id="cib-bootstrap-options" \
>     cluster-infrastructure="openais" \
>     dc-version="1.1.5-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>     last-lrm-refresh="1325273678" \
>     expected-quorum-votes="2" \
>     no-quorum-policy="ignore" \
>     stonith-enabled="false"
> rsc_defaults $id="rsc-options" \
>     resource-stickiness="100"
>
> --
> Chet Burgess
> c...@liquidreality.org
Re: [Pacemaker] Where is MAXMSG defined?
On Tue, Feb 07, 2012 at 01:13:19PM +0200, Adrian Fita wrote:
> Hi.
>
> I can't find any trace of "define MAXMSG" in either pacemaker, corosync,
> or heartbeat's source code. I tried with "grep -R 'MAXMSG' *" and nothing.
> Where is it defined?!

If you are asking about what I think you are, then that would be in glue,
include/clplumbing/ipc.h

But be careful when fiddling with it.

What are you trying to solve, btw?

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
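A quick way to confirm this is to grep in the cluster-glue sources rather than
in the pacemaker, corosync or heartbeat trees; the path below assumes a
cluster-glue checkout in the current directory:

    grep -Rn "MAXMSG" cluster-glue/include/clplumbing/ipc.h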
[Pacemaker] Where is MAXMSG defined?
Hi.

I can't find any trace of "define MAXMSG" in either pacemaker, corosync, or
heartbeat's source code. I tried with "grep -R 'MAXMSG' *" and nothing.
Where is it defined?!

Thanks.
--
Fita Adrian
[Pacemaker] Stopping heartbeat service on one node leads to restart of resources on other node in cluster
Hello,

I have a 2-node cluster with the following configuration:

node $id="9e53a111-0dca-496c-9461-a38f3eec4d0e" mcg2 \
    attributes standby="off"
node $id="a90981f8-d993-4411-89f4-aff7156136d2" mcg1 \
    attributes standby="off"
primitive ClusterIP ocf:mcg:MCG_VIPaddr_RA \
    params ip="192.168.115.50" cidr_netmask="255.255.255.0" nic="bond1.115:1" \
    op monitor interval="40" timeout="20" \
    meta target-role="Started"
primitive EMS ocf:heartbeat:jboss \
    params jboss_home="/opt/jboss-5.1.0.GA" java_home="/opt/jdk1.6.0_29/" \
    op start interval="0" timeout="240" \
    op stop interval="0" timeout="240" \
    op monitor interval="30s" timeout="40s"
primitive NDB_MGMT ocf:mcg:NDB_MGM_RA \
    op monitor interval="120" timeout="120"
primitive NDB_VIP ocf:heartbeat:IPaddr2 \
    params ip="192.168.117.50" cidr_netmask="255.255.255.255" nic="bond0.117:1" \
    op monitor interval="30" timeout="10"
primitive Rmgr ocf:mcg:RM_RA \
    op monitor interval="60" role="Master" timeout="30" on-fail="restart" \
    op monitor interval="40" role="Slave" timeout="40" on-fail="restart"
primitive Tmgr ocf:mcg:TM_RA \
    op monitor interval="60" role="Master" timeout="30" on-fail="restart" \
    op monitor interval="40" role="Slave" timeout="40" on-fail="restart"
primitive mysql ocf:mcg:MYSQLD_RA \
    op monitor interval="180" timeout="200"
primitive ndbd ocf:mcg:NDBD_RA \
    op monitor interval="120" timeout="120"
primitive pimd ocf:mcg:PIMD_RA \
    op monitor interval="60" role="Master" timeout="30" on-fail="restart" \
    op monitor interval="40" role="Slave" timeout="40" on-fail="restart"
ms ms_Rmgr Rmgr \
    meta master-max="1" master-max-node="1" clone-max="2" clone-node-max="1" interleave="true" notify="true"
ms ms_Tmgr Tmgr \
    meta master-max="1" master-max-node="1" clone-max="2" clone-node-max="1" interleave="true" notify="true"
ms ms_pimd pimd \
    meta master-max="1" master-max-node="1" clone-max="2" clone-node-max="1" interleave="true" notify="true"
clone EMS_CLONE EMS \
    meta globally-unique="false" clone-max="2" clone-node-max="1" target-role="Started"
clone mysqld_clone mysql \
    meta globally-unique="false" clone-max="2" clone-node-max="1"
clone ndbdclone ndbd \
    meta globally-unique="false" clone-max="2" clone-node-max="1" target-role="Started"
colocation ip_with_Pimd inf: ClusterIP ms_pimd:Master
colocation ip_with_RM inf: ClusterIP ms_Rmgr:Master
colocation ip_with_TM inf: ClusterIP ms_Tmgr:Master
colocation ndb_vip-with-ndb_mgm inf: NDB_MGMT NDB_VIP
order RM-after-mysqld inf: mysqld_clone ms_Rmgr
order TM-after-RM inf: ms_Rmgr ms_Tmgr
order ip-after-pimd inf: ms_pimd ClusterIP
order mysqld-after-ndbd inf: ndbdclone mysqld_clone
order pimd-after-TM inf: ms_Tmgr ms_pimd
property $id="cib-bootstrap-options" \
    dc-version="1.0.11-55a5f5be61c367cbd676c2f0ec4f1c62b38223d7" \
    cluster-infrastructure="Heartbeat" \
    no-quorum-policy="ignore" \
    stonith-enabled="false"
rsc_defaults $id="rsc-options" \
    migration_threshold="3" \
    resource-stickiness="100"

With both nodes up and running, if the heartbeat service is stopped on either
node, the following resources are restarted on the other node: mysqld_clone,
ms_Rmgr, ms_Tmgr, ms_pimd, ClusterIP.

From the Heartbeat debug logs, it seems the policy engine is initiating a
restart operation for the above resources, but the reason for this is not
clear.
Following are some excerpts from the logs:

Feb 07 11:06:31 MCG1 pengine: [20534]: info: determine_online_status: Node mcg2 is shutting down
Feb 07 11:06:31 MCG1 pengine: [20534]: info: determine_online_status: Node mcg1 is online
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: clone_print: Master/Slave Set: ms_Rmgr
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Rmgr:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Rmgr:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Rmgr:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Rmgr:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Masters: [ mcg1 ]
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Slaves: [ mcg2 ]
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: clone_print: Master/Slave Set: ms_Tmgr
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Tmgr:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Tmgr:0 active on mcg1
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Tmgr:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: debug: native_active: Resource Tmgr:1 active on mcg2
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Masters: [ mcg1 ]
Feb 07 11:06:31 MCG1 pengine: [20534]: notice: short_print: Slaves: [ mcg2 ]
Feb 07 11:06:31 MCG1 pengin
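One way to dig further into why the policy engine schedules those restarts is
to replay the transition offline with ptest (shipped with pacemaker 1.0) and
inspect the planned actions and allocation scores. This is only a rough
sketch; the exact options may vary with your ptest version, and the file name
is arbitrary:

    # save the CIB as the policy engine saw it just before the shutdown
    cibadmin -Q > cib-before-shutdown.xml

    # replay the policy engine run, verbosely, showing allocation scores
    ptest -x cib-before-shutdown.xml -VVV -s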