Re: [Pacemaker] error installing CentOS clvm after using clusterlabs repository
On Wed, Aug 4, 2010 at 3:05 AM, Michael Fung wrote:
> Thanks to all who helped give hints.
>
> I switched to Debian Squeeze.
>
> I don't want to spend time studying RHCS on RHEL 5 if Pacemaker/Corosync
> is the future. Life is short.

F-13 or the RHEL-6 betas also have all the bits you need (including
pacemaker).

> Rgds,
> Michael
>
> On 2010/8/3 03:29 PM, Brett Delle Grazie wrote:
>> [snip]

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
[Pacemaker] [Problem]A compilation error of Pacemaker1.1.
Hi,

I compiled Pacemaker 1.1, but the following error occurred.

[r...@srv01 Pacemaker-1-1-5ce5b34cf3ab]# export PREFIX=/usr;export LCRSODIR=$PREFIX/libexec/lcrso;export CLUSTER_USER=hacluster;export CLUSTER_GROUP=haclient
[r...@srv01 Pacemaker-1-1-5ce5b34cf3ab]# ./autogen.sh && ./configure --prefix=$PREFIX --localstatedir=/var --with-lcrso-dir=$LCRSODIR
[r...@srv01 Pacemaker-1-1-5ce5b34cf3ab]# make install
s -Werror -fPIC -MT utils.lo -MD -MP -MF .deps/utils.Tpo -c utils.c -fPIC -DPIC -o .libs/utils.o
cc1: warnings being treated as errors
utils.c:65: warning: 'common' defined but not used
gmake[2]: *** [utils.lo] Error 1
gmake[2]: Leaving directory `/opt/Pacemaker-1-1-5ce5b34cf3ab/lib/common'
gmake[1]: *** [install-recursive] Error 1
gmake[1]: Leaving directory `/opt/Pacemaker-1-1-5ce5b34cf3ab/lib'
make: *** [install-recursive] Error 1

My environment is as follows.

 * RHEL5.5(x64)
 * corosync 1.2.7
 * Pacemaker-1-1-5ce5b34cf3ab.tar
 * Cluster-Resource-Agents-bfcc4e050a07.tar
 * Reusable-Cluster-Components-8286b46c91e3.tar

Is there a problem with my build procedure?

Best Regards,
Hideo Yamauchi.
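The build log above shows gcc's "defined but not used" warning being promoted to an error by -Werror. The following is only a minimal sketch of that class of warning, not a description of what utils.c actually contains: the names `unused_label`, `used_label` and `get_used_label` are hypothetical, and the `unused` attribute is a GCC extension shown as one possible way to keep a symbol that is intentionally unreferenced.

```c
/* Hypothetical stand-in for a static symbol that is defined but never
 * referenced. Without the attribute, gcc -Wall reports
 * "defined but not used", and -Werror turns that report into a failed
 * build. The attribute (a GCC extension) marks the symbol as
 * intentionally unused. */
static const char *unused_label __attribute__((unused)) = "common";

/* A static that is referenced somewhere does not trigger the warning. */
static const char *used_label = "utils";

const char *get_used_label(void)
{
    return used_label;
}
```

Whether the symbol in the real source should be removed, used, or annotated is a decision for the Pacemaker maintainers; the sketch only illustrates why the compiler stops.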
[Pacemaker] [PATCH]A redundant if sentence.
Hi,

This is a patch for a redundant if condition in pengine.

void
unpack_operation(
    action_t *action, xmlNode *xml_obj, pe_working_set_t* data_set)
{
(snip)
    if(safe_str_eq(class, "stonith")) {
        action->needs = rsc_req_nothing;
        value = "nothing (fencing op)";

    } else if(value == NULL && safe_str_neq(action->task, CRMD_ACTION_START)) {
(snip)
    } else if(data_set->no_quorum_policy == no_quorum_ignore
        || safe_str_eq(class, "stonith")) {   *** ---> A redundant if condition.
        action->needs = rsc_req_nothing;
        value = "nothing (default)";

    } else if(data_set->no_quorum_policy == no_quorum_freeze
          && is_set(data_set->flags, pe_flag_stonith_enabled)) {
(snip)

Best Regards,
Hideo Yamauchi.

redundant_if_sentence.patch
Description: 205491641-redundant_if_sentence.patch
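To see why the second `safe_str_eq(class, "stonith")` test can never change the outcome, here is a small self-contained model of the chain. It is a sketch under stated assumptions, not Pacemaker code: `pick_branch`, `count_differences`, the `branch` enum and the `str_eq` helper are hypothetical names, and `str_eq` is a simplified NULL-safe comparison standing in for `safe_str_eq`. The model enumerates every input combination and counts the cases where keeping or dropping the second test gives a different branch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Which arm of the if/else-if chain handled the operation. */
enum branch {
    BRANCH_FENCING_OP,      /* class is "stonith" */
    BRANCH_NON_START,       /* value == NULL and task is not "start" */
    BRANCH_NOTHING_DEFAULT, /* no_quorum_policy is "ignore" */
    BRANCH_LATER            /* any arm further down the chain */
};

/* Simplified, hypothetical stand-in for safe_str_eq(): NULL-safe equality. */
static bool str_eq(const char *a, const char *b)
{
    return a != NULL && b != NULL && strcmp(a, b) == 0;
}

/* Model of the chain; keep_second_stonith_test toggles the disputed clause. */
enum branch pick_branch(const char *class_name, bool value_is_null,
                        bool task_is_start, bool quorum_ignored,
                        bool keep_second_stonith_test)
{
    if (str_eq(class_name, "stonith")) {
        return BRANCH_FENCING_OP;
    } else if (value_is_null && !task_is_start) {
        return BRANCH_NON_START;
    } else if (quorum_ignored
               || (keep_second_stonith_test && str_eq(class_name, "stonith"))) {
        return BRANCH_NOTHING_DEFAULT;
    }
    return BRANCH_LATER;
}

/* Count input combinations where dropping the second test changes the result. */
int count_differences(void)
{
    const char *classes[] = { "stonith", "ocf", NULL };
    int differences = 0;

    for (size_t c = 0; c < 3; c++) {
        for (int bits = 0; bits < 8; bits++) {
            bool value_is_null = (bits & 1) != 0;
            bool task_is_start = (bits & 2) != 0;
            bool quorum_ignored = (bits & 4) != 0;

            if (pick_branch(classes[c], value_is_null, task_is_start,
                            quorum_ignored, true)
                != pick_branch(classes[c], value_is_null, task_is_start,
                               quorum_ignored, false)) {
                differences++;
            }
        }
    }
    return differences;
}
```

Because the first arm already returns for every "stonith" class, the later arms are only reached when the class is something else, so the second comparison is always false by the time it is evaluated.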
Re: [Pacemaker] [PATCH]The changing of the log level of pengine process.
Hi Andrew,

Thank you for your comment.

It is difficult for me to explain this in English.
This patch addresses a rather special request from our user.

When a node is fenced and a STONITH resource ends up active on more than
one node, our user wants the resulting log message to be output at warning
level.

Last updated: Thu Jul 29 09:58:36 2010
Stack: openais
Current DC: srv01 - partition WITHOUT quorum
Version: 1.0.9-74392a28b7f31d7ddc86689598bd23114f58978b
2 Nodes configured, 2 expected votes
2 Resources configured.

Online: [ srv01 srv02 ]

 Resource Group: group-1
     prmDummy1  (ocf::heartbeat:Dummy): Started srv01
 stonith0       (stonith:external/ssh): Started srv01

For example, with the resource placement above, the STONITH resource moves
to srv02 when srv01 is fenced.
Our user wanted to change the log level used in that case.

However, I understand that this is a very special request.
If possible, I would like to be able to enable such special behaviour with
a Pacemaker start-up option.

Best Regards,
Hideo Yamauchi.

--- Andrew Beekhof wrote:

> 2010/7/29 :
> > Hi All,
> >
> > Our user has a request concerning the log level that pengine uses after
> > processing.
> >
> > When STONITH is carried out, pengine should log at warning level if the
> > only resource active on more than one node is a STONITH resource.
> >
> > This is because several STONITH resources may be started when STONITH is
> > carried out, and the severity of that differs from a normal resource
> > being started more than once.
>
> I'm having trouble understanding the purpose of this patch...
>
> If the only resource on a node to be fenced is a stonith resource, and
> that resource is also running on another node, then unset
> "was_processing_error"... is that right?
>
> Why do that?
>
> > I wrote a patch that manipulates the was_processing_error flag to meet
> > our user's request.
> >
> > This patch may be rather special.
> > I do not think that all users need this patch.
> >
> > Please give me your opinion on this patch.
> >
> > Best Regards,
> > Hideo Yamauchi.
Re: [Pacemaker] error installing CentOS clvm after using clusterlabs repository
Thanks to all who helped give hints.

I switched to Debian Squeeze.

I don't want to spend time studying RHCS on RHEL 5 if Pacemaker/Corosync
is the future. Life is short.

Rgds,
Michael

On 2010/8/3 03:29 PM, Brett Delle Grazie wrote:
> [snip]
Re: [Pacemaker] [Problem]The problem of the combination of Pacemaker and corosync1.2.7.
Hi Andrew,

Thank you for your comment.

> No need to wait, the current tip of Pacemaker 1.1 is perfectly stable
> (and included for RHEL6.0).
> Almost all the testing has been done for 1.1.3, I've just been busy
> helping out with some other projects at Red Hat and haven't had time
> to do the actual release.
>
> To make use of CPG-based communication, remove the "service" section
> for pacemaker from corosync.conf and instead run:
>    service pacemaker start
> after starting corosync.
>
> Once the 1.1.3 packages are out, this will be the official advice for
> anyone experiencing startup/shutdown issues when using Pacemaker with
> Corosync.
> Calling fork() in a multi-threaded environment (corosync) is just far
> too problematic.

All right.
All of my problems may be resolved by 1.1.3.
I will try Pacemaker 1.1.3 from now on.

When is the Pacemaker 1.1 release expected?

Best Regards,
Hideo Yamauchi.

--- Andrew Beekhof wrote:

> On Mon, Aug 2, 2010 at 3:17 AM, wrote:
> > Hi,
> >
> > I checked the behaviour of corosync 1.2.7 combined with Pacemaker.
> >
> > The combination is as follows.
> >
> >  * corosync 1.2.7
> >  * Pacemaker-1-0-74392a28b7f3.tar
> >  * Cluster-Resource-Agents-bfcc4e050a07.tar
> >  * Reusable-Cluster-Components-8286b46c91e3.tar
> >
> > I checked the following on two nodes, both on virtual machines
> > (RHEL5.5 x86) and on real machines (RHEL5.5 x64).
> > No resources were configured.
> >
> > 1) That a node does not hang when only corosync is started (and when
> >    it is stopped).
> > 2) That a node does not hang when Pacemaker and corosync are started
> >    together (and when they are stopped).
> >
> > I ran each check only 20 times in each environment (x86 and x64).
> >
> > Unfortunately, the following problems occurred.
> >  * No problem occurred this time when starting only corosync (or when
> >    stopping it).
> >
> > Problem 1) When starting on the virtual machine, the virtual machine
> >            sometimes hangs.
> >            As with the earlier problem, CPU usage is nearly 100%.
> >
> > Problem 2) In some cases the cluster could not be formed after start-up.
> >
> > Problem 3) In some cases the cib process and the attrd process failed
> >            to start.
> >
> > Jul 30 14:25:46 x3650g attrd: [26258]: ERROR: ais_dispatch: Receiving message body failed: (2) Library error: Resource temporarily unavailable (11)
> > Jul 30 14:25:46 x3650g attrd: [26258]: ERROR: ais_dispatch: AIS connection failed
> > Jul 30 14:25:46 x3650g cib: [26256]: ERROR: ais_dispatch: Receiving message body failed: (2) Library error: Resource temporarily unavailable (11)
> > Jul 30 14:25:46 x3650g cib: [26256]: ERROR: ais_dispatch: AIS connection failed
> > Jul 30 14:25:46 x3650g attrd: [26258]: CRIT: attrd_ais_destroy: Lost connection to OpenAIS service!
> > Jul 30 14:25:46 x3650g cib: [26256]: ERROR: cib_ais_destroy: AIS connection terminated
> > Jul 30 14:25:46 x3650g attrd: [26258]: info: main: Exiting...
> > Jul 30 14:25:46 x3650g attrd: [26258]: ERROR: attrd_cib_connection_destroy: Connection to the CIB terminated...
> > Jul 30 14:25:46 x3650g stonithd: [26255]: ERROR: ais_dispatch: Receiving message body failed: (2) Library error: Success (0)
> > Jul 30 14:25:46 x3650g stonithd: [26255]: ERROR: ais_dispatch
> >
> > Can this problem be resolved with Pacemaker 1.0 and corosync 1.2.7?
> >
> > I know that work has begun to switch communication to CPG in the new
> > Pacemaker architecture.
> > When we use Pacemaker together with corosync, should we wait for the
> > CPG work to be finished?
> > (Should we wait for the Pacemaker 1.1 series?)
>
> [snip]
>
> > Because the logs are large, I will follow up after registering this
> > problem in Bugzilla.
> >
> > Best Regards,
> > Hideo Yamauchi.
Re: [Pacemaker] [Problem]The problem of the combination of Pacemaker and corosync1.2.7.
Hi Vladislav,

Thank you for your comment.

> This is probably connected to
> http://marc.info/?l=openais&m=127977785007234&w=2
>
> Steven promised to look at that issue after his vacation.

I will wait for Steven's fix.
Meanwhile, I will use Pacemaker 1.1, as Andrew recommended.

Best Regards,
Hideo Yamauchi.

--- Vladislav Bogdanov wrote:

> 02.08.2010 04:17, renayama19661...@ybb.ne.jp wrote:
>
> ...
>
> > Problem 3) In some cases the cib process and the attrd process failed
> >            to start.
> >
> > [snip]
> >
> > Can this problem be resolved with Pacemaker 1.0 and corosync 1.2.7?
> >
> > I know that work has begun to switch communication to CPG in the new
> > Pacemaker architecture.
> > When we use Pacemaker together with corosync, should we wait for the
> > CPG work to be finished?
> > (Should we wait for the Pacemaker 1.1 series?)
> >
> > Because the logs are large, I will follow up after registering this
> > problem in Bugzilla.
>
> This is probably connected to
> http://marc.info/?l=openais&m=127977785007234&w=2
>
> Steven promised to look at that issue after his vacation.
>
> Best,
> Vladislav
Re: [Pacemaker] what's the deal with 1.0.9 init_ais_connection?
On Tue, Aug 3, 2010 at 7:18 PM, Alan Jones wrote:
> I'm trying to debug a cib segfault in init_ais_connection() for pacemaker
> 1.0.9.

Don't. Use 1.0.9.1 which already has the patch below.

> The 1.0.8 version of this function is pretty straightforward, calling one
> of the comm stack's connect functions depending on the config.
> In 1.0.9, however, it appears to be a recursive call that never ends.
> There is also an init_ais_connection_once() below that appears to be the
> intended function to call within this function.
> Is it safe for me to make this change?
> Alan
> ---
> ajo...@ajones-dl:~/hasrc/Pacemaker-1-0-Pacemaker-1.0.9/lib/common$ diff -c ais.c.org ais.c
> *** ais.c.org 2010-06-23 03:25:30.0 -0700
> --- ais.c 2010-08-03 10:20:38.320875334 -0700
> ***************
> *** 582,588 ****
>   {
>       int retries = 0;
>       while(retries++ < 30) {
> !         int rc = init_ais_connection(dispatch, destroy, our_uuid, our_uname, nodeid);
>           switch(rc) {
>               case CS_OK:
>                   return TRUE;
> --- 582,588 ----
>   {
>       int retries = 0;
>       while(retries++ < 30) {
> !         int rc = init_ais_connection_once(dispatch, destroy, our_uuid, our_uname, nodeid);
>           switch(rc) {
>               case CS_OK:
>                   return TRUE;
[Pacemaker] what's the deal with 1.0.9 init_ais_connection?
I'm trying to debug a cib segfault in init_ais_connection() for pacemaker
1.0.9.
The 1.0.8 version of this function is pretty straightforward, calling one of
the comm stack's connect functions depending on the config.
In 1.0.9, however, it appears to be a recursive call that never ends.
There is also an init_ais_connection_once() below that appears to be the
intended function to call within this function.
Is it safe for me to make this change?
Alan
---
ajo...@ajones-dl:~/hasrc/Pacemaker-1-0-Pacemaker-1.0.9/lib/common$ diff -c ais.c.org ais.c
*** ais.c.org 2010-06-23 03:25:30.0 -0700
--- ais.c 2010-08-03 10:20:38.320875334 -0700
***************
*** 582,588 ****
  {
      int retries = 0;
      while(retries++ < 30) {
!         int rc = init_ais_connection(dispatch, destroy, our_uuid, our_uname, nodeid);
          switch(rc) {
              case CS_OK:
                  return TRUE;
--- 582,588 ----
  {
      int retries = 0;
      while(retries++ < 30) {
!         int rc = init_ais_connection_once(dispatch, destroy, our_uuid, our_uname, nodeid);
          switch(rc) {
              case CS_OK:
                  return TRUE;
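The shape of the fix in the diff above is a retry wrapper that calls a single-attempt function rather than itself. The following is a minimal sketch of that pattern only; it is not the Pacemaker implementation. The names `connect_once_fn`, `connect_with_retries` and `succeed_on_third_attempt` are hypothetical, and the real function takes different arguments and distinguishes several return codes.

```c
#include <stdbool.h>

/* Hypothetical signature for one connection attempt: returns 0 on success. */
typedef int (*connect_once_fn)(void *ctx);

/* Retry wrapper: it calls the single-attempt function, never itself.
 * Calling itself here would recurse before the loop could ever finish,
 * which is the behaviour described for 1.0.9 above. */
bool connect_with_retries(connect_once_fn connect_once, void *ctx,
                          int max_attempts)
{
    int attempts = 0;

    while (attempts++ < max_attempts) {
        if (connect_once(ctx) == 0) {
            return true;
        }
        /* A real implementation would inspect the error or pause here. */
    }
    return false;
}

/* Demo attempt function: fails until the counter behind ctx reaches 3. */
int succeed_on_third_attempt(void *ctx)
{
    int *calls = ctx;

    *calls += 1;
    return (*calls >= 3) ? 0 : -1;
}
```

With a counter initialised to 0 and a limit of 30, the wrapper returns true after exactly three calls to the demo function; with a limit of 2 it gives up and returns false.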
Re: [Pacemaker] Two cloned VM, only one of the both shows online when starting corosync/pacemaker
On Mon, Jul 26, 2010 at 7:08 PM, Guillaume Chanaud wrote:

On 26/07/2010 14:38, Andrew Beekhof wrote:

On Fri, Jul 16, 2010 at 5:44 PM, Guillaume Chanaud wrote:

Hello,

[snip]

# Optionally assign a fixed node id (integer)
nodeid: 30283707487

In addition to changing the node's ip address, did you change this too?

Yes, as I said in the mail, I changed the nodeid and bindnetaddress for each
node. Even non-fixed values don't work.

Looks like corosync crashed. Possibly related to the duplicate nodeid?
Was there a core file in /var/run/corosync?
What does "ulimit -c" say?

There is a /var/run/corosync.pid (with the correct pid) which stays there
even after the corosync crash.
For ulimit:

[r...@www01 run]# ulimit -c
0

These values are the same on the second node, which runs fine (and if I stop
this second node, the first node will run fine, but not the second...)

Yes, but that doesn't help us find out why the first one is crashing.
Please run "ulimit -c unlimited" before starting corosync so that we can
get a core file and stack trace.

Hello,

Sorry for the delay; July is not the best month to get things working
fast.
Here is the core dump file (55MB):
http://www.connecting-nature.com/corosync/core

The corosync version is 1.2.3.

Thanks for your help.
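The "ulimit -c" check discussed above has a programmatic counterpart in the POSIX getrlimit()/setrlimit() calls. As a minimal sketch (the function names `raise_core_limit` and `core_dumps_disabled` are hypothetical, and nothing here is taken from corosync), a process can inspect its core-file limit and raise the soft limit up to the hard limit before starting work:

```c
#include <sys/resource.h>

/* Raise the soft core-file size limit to the hard limit for this process
 * and any children it starts afterwards. This matches "ulimit -c unlimited"
 * only when the hard limit is RLIM_INFINITY.
 * Returns 0 on success, -1 on failure. */
int raise_core_limit(void)
{
    struct rlimit limit;

    if (getrlimit(RLIMIT_CORE, &limit) != 0) {
        return -1;
    }
    limit.rlim_cur = limit.rlim_max;
    return setrlimit(RLIMIT_CORE, &limit);
}

/* Report whether core files are currently disabled (soft limit of 0),
 * which is the state "ulimit -c" printed as 0 in the thread above.
 * Returns 1 if disabled, 0 if enabled, -1 if the limit cannot be read. */
int core_dumps_disabled(void)
{
    struct rlimit limit;

    if (getrlimit(RLIMIT_CORE, &limit) != 0) {
        return -1;
    }
    return limit.rlim_cur == 0 ? 1 : 0;
}
```

An unprivileged process may raise its soft limit only as far as the hard limit, so if the hard limit is itself 0 the shell command and this sketch are equally unable to enable core files.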
Re: [Pacemaker] Cluster Amnesia problem in Pacemaker
Thanks for the clarification.
[Pacemaker] testing resources (was: Crazy idea #1)
Hi,

On Mon, Aug 02, 2010 at 11:07:46PM +0200, Lars Marowsky-Bree wrote:
> On 2010-08-02T19:05:53, Dejan Muhamedagic wrote:
>
> > Testing/starting/etc resources is easy, but the shell doesn't
> > know about dependencies.
>
> I think this might not even be needed for the first step. Clearly, a
> mode to run a specific single resource would be required.
>
> I do think "ra test" is the right level for this; but maybe the "ra"
> command hierarchy should be available from configure too?

It is, because it is useful to get the RA documentation while
defining resources.

> ra test

The reason to have it running from the configure level is that
that is the place where resources are defined: resource names and
their parameters. The ra level has no such infrastructure.

> If node is not given, run locally. If [...] is not given, run
> ocf-tester.

The shell doesn't have the capability to run operations on other
nodes.

> (And really, I mean run ocf-tester - I don't want to duplicate the test
> case logic in more than one place.)
>
> In the first step, the user would be required to ensure all dependencies
> are up and available, or brought online manually before.
>
> > We can introduce a new sublevel at configure to allow fiddling with
> > resource operations at will, but it would be better to have, say,
> > ptest provide information about the order of resource operations.
>
> I can see the value here too - but this gets complex really quickly, if
> the ordering/locations are not local only, or if clones get involved.

It is true that it would be difficult to cover all the cases.

> (We end up single stepping through the transition graph, basically. That
> may be quite worthwhile to implement, but seems to be a quite different
> scope, and may best be implemented via a "debugger" interface to the
> crmd, so that the shell can interactively trace + modify the transition?
> Word up for complexity! ;-)
>
> > order. For instance, after creating new resources, the user would
> > just say "test" before commit and the shell would run the
> > following:
> >
> > test A (ocf-tester)
> > start A
> > test B
> > start B
> > test C
> > stop B
> > stop A
>
> I _think_ we are fine if we allow the "ra test" command to operate on
> groups too.
>
> But if we try to implement the "test" mode so that it resolves
> dependencies, we will end up creating an expectation that it always
> works, and that the shell figures out where to run stuff etc. I'd
> rather have something simple that we can guarantee always works, than
> something complex that will lead to many bug reports ;-)

Not sure, but I suspect that the implementation of testing groups
wouldn't be much different from gathering dependencies from some
external source. At any rate, we may start with something less
ambitious.

Cheers,

Dejan

P.S. Moving the discussion to the user list and adjusting the
subject.

> Regards,
>     Lars
>
> --
> Architect Storage/HA, OPS Engineering, Novell, Inc.
> SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
> "Experience is the name everyone gives to their mistakes." -- Oscar Wilde
>
> ___
> Pcmk-devel mailing list
> pcmk-de...@oss.clusterlabs.org
> http://oss-2.clusterlabs.org/mailman/listinfo/pcmk-devel
Re: [Pacemaker] error installing CentOS clvm after using clusterlabs repository
Hi,

The clvm resource agent is upcoming in Fedora 13 as part of an updated
heartbeat/pacemaker package. FC 13 has already moved to
Corosync/OpenAIS/Pacemaker/DRBD, and it should feel familiar to you coming
from CentOS.

Regards,

Martijn Sprengers

Disclaimer: This message is intended only for the addressees. No rights can
be derived from this message.

-----Original Message-----
From: Brett Delle Grazie [mailto:brett.dellegra...@intact-is.com]
Sent: Tuesday, 3 August 2010 9:30
To: m...@3open.org; pacemaker@oss.clusterlabs.org
Subject: Re: [Pacemaker] error installing CentOS clvm after using clusterlabs repository

[snip]
Re: [Pacemaker] error installing CentOS clvm after using clusterlabs repository
03.08.2010 10:29, Brett Delle Grazie wrote:
...
> (c) Try recompiling RHEL 6.x Beta packages - no guarantees here but it
> should be possible, maybe.

To use OCFS2, GFS2 or CLVM with corosync one needs support for the
userspace cluster stack in DLM, which is missing from the EL5 kernel, so
this would not help. Backporting that feature doesn't seem possible.

Best,
Vladislav
Re: [Pacemaker] error installing CentOS clvm after using clusterlabs repository
Hi Mike,

In RHEL 5.x and CentOS 5.x you must use CMAN and the Red Hat Cluster
Suite (RHCS) if you are going to use clustered LVM.

This is because clvmd currently uses the CMAN interface to the cluster.
In later versions, Red Hat is moving towards a Corosync / OpenAIS /
(Pacemaker | RgManager) solution, but this will take a long time.

Christine Caulfield (from Red Hat) wrote an excellent document describing
the change process here:
http://people.redhat.com/ccaulfie/docs/Whither%20cman.pdf

I guess your options are:
(a) Switch to an RHCS based cluster, at least for those nodes with
clustered LVM requirements (and GFS, GFS2 etc)
(b) Switch to RHEL 6.x Beta
(c) Try recompiling RHEL 6.x Beta packages - no guarantees here but it
should be possible, maybe.
(d) Try compiling current source of lvm2-cluster packages from Fedora or
Rawhide as they can use current versions of OpenAIS. The RHEL 5.x
versions of lvm2-cluster are fixed at using the CMAN interface, not OpenAIS
(e) Switch to a Debian based distro - Lenny is production ready and has
CLVM / Pacemaker / Corosync in backports ;)
(f) Something someone else on the list with more experience comes up
with :)

Good luck, please let us know how you get on.

Best Regards,

Brett

On Tue, 2010-08-03 at 10:29 +0800, Michael Fung wrote:
> Hi all,
>
> I am using the following repository to install pacemaker and corosync:
>
> [clusterlabs]
> name=High Availability/Clustering server technologies (epel-5)
> baseurl=http://www.clusterlabs.org/rpm/epel-5
> ...
>
> The cluster is working well.
>
> Later, I wanted to use clvm, that is, the lvm2-cluster package. yum gets
> it from the CentOS repository, but the dependencies are broken. It seems
> the openais package from clusterlabs is different from the CentOS one.
>
> I skipped the dependencies and force installed the related library files.
> Finally I got clvmd to run but it complains:
>
> Starting clvmd: clvmd could not connect to cluster manager
> Consult syslog for more information
>
> Any ideas please?
>
> Rgds,
> Michael

--
Best Regards,

Brett Delle Grazie