Re: [Pacemaker] [PATCH] pingd checks pidfile on start
Hi Andrew

> Any chance you could redo this as a github pull request? :-D

Thanks for your reply.
I sent a pull request.

Regards,
Takatoshi MATSUO

2012/3/29 Andrew Beekhof :
> Any chance you could redo this as a github pull request? :-D
>
> On Wed, Mar 14, 2012 at 6:49 PM, Takatoshi MATSUO wrote:
>> Hi
>>
>> I use pacemaker 1.0.11 and pingd RA.
>> Occasionally, pingd's first monitor is failed after start.
>>
>> It seems that the main cause is pingd daemon returns 0 before creating pidfile
>> and RA doesn't check pidfile on start.
>>
>> test script
>> -
>> while true; do
>>     killall pingd; sleep 3
>>     rm -f /tmp/pingd.pid; sleep 1
>>     /usr/lib64/heartbeat/pingd -D -p /tmp/pingd.pid -a ping_status -d 0 -m 100 -h 192.168.0.1
>>     echo $?
>>     ls /tmp/pingd.pid; sleep .1
>>     ls /tmp/pingd.pid
>> done
>> -
>>
>> result
>> -
>> 0
>> /tmp/pingd.pid
>> /tmp/pingd.pid
>> 0
>> ls: cannot access /tmp/pingd.pid: No such file or directory <- NG
>> /tmp/pingd.pid
>> 0
>> /tmp/pingd.pid
>> /tmp/pingd.pid
>> 0
>> /tmp/pingd.pid
>> /tmp/pingd.pid
>> 0
>> /tmp/pingd.pid
>> /tmp/pingd.pid
>> 0
>> ls: cannot access /tmp/pingd.pid: No such file or directory <- NG
>> /tmp/pingd.pid
>> --
>>
>> Please consider the attached patch for pacemaker-1.0.
>>
>> Regards,
>> Takatoshi MATSUO
>>
>> ___
>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>>
>> Project Home: http://www.clusterlabs.org
>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>> Bugs: http://bugs.clusterlabs.org
>>
>
> ___
> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
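A rough illustration of the kind of start-time check being discussed: instead of trusting the daemon's exit code, the RA's start action can poll for the pidfile before reporting success. This is only a sketch of that idea, not the submitted patch; the retry count and paths are illustrative.

#!/bin/sh
# Sketch only -- not the actual patch. Poll for the pidfile after launching
# pingd instead of relying on the daemon's exit status alone.
PIDFILE=/tmp/pingd.pid
HOSTLIST=192.168.0.1

pingd_start() {
    /usr/lib64/heartbeat/pingd -D -p "$PIDFILE" -a ping_status -d 0 -m 100 -h "$HOSTLIST"

    # The daemon can return 0 before the pidfile exists, which is what makes
    # the first monitor fail; wait for the file to appear before returning.
    tries=0
    while [ ! -f "$PIDFILE" ]; do
        tries=$((tries + 1))
        if [ "$tries" -gt 10 ]; then
            echo "pingd did not create $PIDFILE" >&2
            return 1    # OCF_ERR_GENERIC
        fi
        sleep 1
    done
    return 0            # OCF_SUCCESS
}

pingd_start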
Re: [Pacemaker] CIB not saved
Normally we log an error at startup if we can't write there... did this not happen? ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Yes, it happened. I saw a warning about writing the CIB... but only after I had written to this mailing list :) Regards -- Fiorenza Meini Spazio Web S.r.l. V. Dante Alighieri, 10 - 13900 Biella Tel.: 015.2431982 - 015.9526066 Fax: 015.2522600 Reg. Imprese, CF e P.I.: 02414430021 Iscr. REA: BI - 188936 Iscr. CCIAA: Biella - 188936 Cap. Soc.: 30.000,00 Euro i.v. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Nodes not rejoining cluster
Gotta have logs. From all 3 nodes mentioned. Only then can we determine if the problem is at the corosync or pacemaker layer - which is the pre-requisit for figuring out what to do next :) On Fri, Mar 30, 2012 at 1:30 PM, Gregg Stock wrote: > I had a circuit breaker go out and take two of the 5 nodes in my cluster > down. Now that their back up and running, they are not rejoining the > cluster. > > Here is what I get from crm_mon -1 > > node 1,2 and 3 itchy, scratchy and walter show the following: > > Last updated: Thu Mar 29 19:04:05 2012 > Last change: Thu Mar 29 19:04:03 2012 via cibadmin on walter > Stack: openais > Current DC: walter - partition with quorum > Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558 > 5 Nodes configured, 5 expected votes > 9 Resources configured. > > > Online: [ itchy scratchy walter butthead timmy ] > > > On butthead I get > > > Last updated: Thu Mar 29 19:04:24 2012 > Last change: Thu Mar 29 18:42:09 2012 via cibadmin on itchy > Stack: openais > Current DC: NONE > 5 Nodes configured, 5 expected votes > 9 Resources configured. > > > OFFLINE: [ itchy scratchy walter butthead timmy ] > > > On Timmy, I get > > > Last updated: Thu Mar 29 19:04:20 2012 > Last change: > Current DC: NONE > 0 Nodes configured, unknown expected votes > 0 Resources configured. > > > > I don't have anything important running yet. so I can do a full clean up of > everything if needed. > > I also get some weird behavior with timmy. I brought this node up with the > host name as timmy.example.com and I changed the host name to timmy but when > the cluster is offline timmy.example.com shows up as offline. I enter crm > node delete timmy.example.com and it goes away until timmy goes offline > again. > > Thanks, > Gregg Stock > > > > ___ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org > ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
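For reference, one way to gather the requested logs from each node is with crm_report (or hb_report on older installations); the time window below is only an example around the incident, and the exact option set may vary with the installed version.

# Run on each of the three nodes (or once with ssh access to all of them);
# adjust the window to cover the time the two nodes were power-cycled.
crm_report -f "2012-03-29 18:30" -t "2012-03-29 19:30" /tmp/rejoin-report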
[Pacemaker] Nodes not rejoining cluster
I had a circuit breaker go out and take two of the 5 nodes in my cluster down. Now that they're back up and running, they are not rejoining the cluster.

Here is what I get from crm_mon -1.

Nodes 1, 2 and 3 (itchy, scratchy and walter) show the following:

Last updated: Thu Mar 29 19:04:05 2012
Last change: Thu Mar 29 19:04:03 2012 via cibadmin on walter
Stack: openais
Current DC: walter - partition with quorum
Version: 1.1.6-3.el6-a02c0f19a00c1eb2527ad38f146ebc0834814558
5 Nodes configured, 5 expected votes
9 Resources configured.

Online: [ itchy scratchy walter butthead timmy ]

On butthead I get:

Last updated: Thu Mar 29 19:04:24 2012
Last change: Thu Mar 29 18:42:09 2012 via cibadmin on itchy
Stack: openais
Current DC: NONE
5 Nodes configured, 5 expected votes
9 Resources configured.

OFFLINE: [ itchy scratchy walter butthead timmy ]

On timmy, I get:

Last updated: Thu Mar 29 19:04:20 2012
Last change:
Current DC: NONE
0 Nodes configured, unknown expected votes
0 Resources configured.

I don't have anything important running yet, so I can do a full clean-up of everything if needed.

I also get some weird behavior with timmy. I brought this node up with the host name timmy.example.com and then changed the host name to timmy, but when the cluster is offline, timmy.example.com shows up as offline. I enter "crm node delete timmy.example.com" and it goes away until timmy goes offline again.

Thanks,
Gregg Stock

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] [Problem] The cluster fails in the stop of the node.
Hi Andrew, > This appears to be resolved with 1.1.7, perhaps look for a patch to backport? I confirm movement of Pacemaker 1.1.7. And I talk about the backporting with Mr Mori. Best Regards, Hideo Yamauchi. --- On Thu, 2012/3/29, Andrew Beekhof wrote: > This appears to be resolved with 1.1.7, perhaps look for a patch to backport? > > On Tue, Mar 27, 2012 at 4:46 PM, wrote: > > Hi All, > > > > When we set a group resource within Master/Slave resource, we found the > > problem that a node could not stop. > > > > This problem occurs in Pacemaker1.0.11. > > > > We confirmed a problem in the following procedure. > > > > Step1) Start all nodes. > > > > > > Last updated: Tue Mar 27 14:35:16 2012 > > Stack: Heartbeat > > Current DC: test2 (b645c456-af78-429e-a40a-279ed063b97d) - partition > > WITHOUT quorum > > Version: 1.0.12-unknown > > 2 Nodes configured, unknown expected votes > > 4 Resources configured. > > > > > > Online: [ test1 test2 ] > > > > Master/Slave Set: msGroup01 > > Masters: [ test1 ] > > Slaves: [ test2 ] > > Resource Group: testGroup > > prmDummy1 (ocf::pacemaker:Dummy): Started test1 > > prmDummy2 (ocf::pacemaker:Dummy): Started test1 > > Resource Group: grpStonith1 > > prmStonithN1 (stonith:external/ssh): Started test2 > > Resource Group: grpStonith2 > > prmStonithN2 (stonith:external/ssh): Started test1 > > > > Migration summary: > > * Node test2: > > * Node test1: > > > > Step2) Stop Slave node. > > > > [root@test2 ~]# service heartbeat stop > > Stopping High-Availability services: Done. > > > > Step3) Stop Master node. However, a loop does the Master node and does not > > stop. > > > > (snip) > > Mar 27 14:38:06 test1 crmd: [21443]: WARN: run_graph: Transition 3 > > (Complete=7, Pending=0, Fired=0, Skipped=0, Incomplete=23, > > Source=/var/lib/pengine/pe-input-3.bz2): Terminated > > Mar 27 14:38:06 test1 crmd: [21443]: ERROR: te_graph_trigger: Transition > > failed: terminated > > Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_graph: Graph 3 (30 actions > > in 30 synapses): batch-limit=30 jobs, network-delay=6ms > > Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_graph: Synapse 0 is > > pending (priority: 0) > > Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem: [Action 12]: > > Pending (id: testMsGroup01:0_stop_0, type: pseduo, priority: 0) > > Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem: * [Input 14]: > > Completed (id: testMsGroup01:0_demote_0, type: pseduo, priority: 0) > > Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem: * [Input 32]: > > Pending (id: msGroup01_stop_0, type: pseduo, priority: 0) > > Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_graph: Synapse 1 is > > pending (priority: 0) > > Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem: [Action 13]: > > Pending (id: testMsGroup01:0_stopped_0, type: pseduo, priority: 0) > > Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem: * [Input 8]: > > Pending (id: prmStateful1:0_stop_0, loc: test1, priority: 0) > > Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem: * [Input 9]: > > Pending (id: prmStateful2:0_stop_0, loc: test1, priority: 0) > > Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_elem: * [Input 12]: > > Pending (id: testMsGroup01:0_stop_0, type: pseduo, priority: 0) > > Mar 27 14:38:06 test1 crmd: [21443]: WARN: print_graph: Synapse 2 was > > confirmed (priority: 0) > > (snip) > > > > I attach data of hb_report. > > > > Best Regards, > > Hideo Yamauchi. 
> > ___ > > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > > > Project Home: http://www.clusterlabs.org > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > > Bugs: http://bugs.clusterlabs.org > > > ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
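For readers trying to reproduce this, the layout described above corresponds roughly to the following crm configuration: a group of Stateful resources wrapped in a master/slave resource, plus a plain group. This is a sketch reconstructed from the crm_mon and log output; the meta attributes are assumptions.

primitive prmStateful1 ocf:pacemaker:Stateful
primitive prmStateful2 ocf:pacemaker:Stateful
group testMsGroup01 prmStateful1 prmStateful2
ms msGroup01 testMsGroup01 \
        meta master-max="1" clone-max="2" notify="true"
primitive prmDummy1 ocf:pacemaker:Dummy
primitive prmDummy2 ocf:pacemaker:Dummy
group testGroup prmDummy1 prmDummy2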
Re: [Pacemaker] [Patch]Patch for crmd-transition-delay processing.
Hi Andrew, Thank you for comment. > The patch makes sense, could you resend as a github pull request? :-D All right!! I send it if ready. Please wait Best Regards, Hideo Yamauchi. --- On Thu, 2012/3/29, Andrew Beekhof wrote: > The patch makes sense, could you resend as a github pull request? :-D > > On Thu, Mar 22, 2012 at 8:18 PM, wrote: > > Hi All, > > > > Sorry > > > > My patch was wrong. > > I send a right patch. > > > > Best Regards, > > Hideo Yamauchi. > > > > --- On Thu, 2012/3/22, renayama19661...@ybb.ne.jp > > wrote: > > > >> Hi All, > >> > >> The crmd-transition-delay waits for the update of the attribute to be late. > >> > >> However, crmd cannot realize the wait of the attribute well because a > >> timer is not reset when the delay of the attribute occurs after a timer > >> was set. > >> > >> As a result, the resource may not be placed definitely. > >> > >> I wrote a patch for Pacemaker 1.0.12. > >> > >> And this patch blocks the handling of tengine when a crmd-transition-delay > >> timer is set. > >> And tengine handles instructions of pengine after a crmd-transition-delay > >> timer exercised it definitely. > >> > >> > >> By this patch, the start of the resource may be late. > >> However, it realizes the placement of a right resource depending on > >> limitation. > >> > >> * I think that the similar correction is necessary for a development > >> version of Pacemaker. > >> > >> Best Regards, > >> Hideo Yamauchi. > > > > ___ > > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > > > Project Home: http://www.clusterlabs.org > > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > > Bugs: http://bugs.clusterlabs.org > > > ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
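For context, crmd-transition-delay is a cluster property, so the behaviour addressed by the patch is only exercised when it is set to a non-zero value, along these lines (the 2s value is only an example):

crm configure property crmd-transition-delay="2s"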
Re: [Pacemaker] Issue with ordering
On Thu, Mar 29, 2012 at 7:07 PM, Vladislav Bogdanov wrote: > Hi Andrew, all, > > I'm continuing experiments with lustre on stacked drbd, and see > following problem: > > I have one drbd resource (ms-drbd-testfs-mdt) is stacked on top of > other (ms-drbd-testfs-mdt-left), and have following constraints > between them: > > colocation drbd-testfs-mdt-with-drbd-testfs-mdt-left inf: > ms-drbd-testfs-mdt ms-drbd-testfs-mdt-left:Master > order drbd-testfs-mdt-after-drbd-testfs-mdt-left inf: > ms-drbd-testfs-mdt-left:promote ms-drbd-testfs-mdt:start > > Then I have filesystem mounted on top of ms-drbd-testfs-mdt > (testfs-mdt resource). > > colocation testfs-mdt-with-drbd-testfs-mdt inf: testfs-mdt > ms-drbd-testfs-mdt:Master > order testfs-mdt-after-drbd-testfs-mdt inf: > ms-drbd-testfs-mdt:promote testfs-mdt:start > > When I trigger event which causes many resources to stop (including > these three), LogActions output look like: > > LogActions: Stop drbd-local#011(lustre01-left) > LogActions: Stop drbd-stacked#011(Started lustre02-left) > LogActions: Stop drbd-testfs-local#011(Started lustre03-left) > LogActions: Stop drbd-testfs-stacked#011(Started lustre04-left) > LogActions: Stop lustre#011(Started lustre04-left) > LogActions: Stop mgs#011(Started lustre01-left) > LogActions: Stop testfs#011(Started lustre03-left) > LogActions: Stop testfs-mdt#011(Started lustre01-left) > LogActions: Stop testfs-ost#011(Started lustre01-left) > LogActions: Stop testfs-ost0001#011(Started lustre02-left) > LogActions: Stop testfs-ost0002#011(Started lustre03-left) > LogActions: Stop testfs-ost0003#011(Started lustre04-left) > LogActions: Stop drbd-mgs:0#011(Master lustre01-left) > LogActions: Stop drbd-mgs:1#011(Slave lustre02-left) > LogActions: Stop drbd-testfs-mdt:0#011(Master lustre01-left) > LogActions: Stop drbd-testfs-mdt-left:0#011(Master lustre01-left) > LogActions: Stop drbd-testfs-mdt-left:1#011(Slave lustre02-left) > LogActions: Stop drbd-testfs-ost:0#011(Master lustre01-left) > LogActions: Stop drbd-testfs-ost-left:0#011(Master lustre01-left) > LogActions: Stop drbd-testfs-ost-left:1#011(Slave lustre02-left) > LogActions: Stop drbd-testfs-ost0001:0#011(Master lustre02-left) > LogActions: Stop drbd-testfs-ost0001-left:0#011(Master lustre02-left) > LogActions: Stop drbd-testfs-ost0001-left:1#011(Slave lustre01-left) > LogActions: Stop drbd-testfs-ost0002:0#011(Master lustre03-left) > LogActions: Stop drbd-testfs-ost0002-left:0#011(Master lustre03-left) > LogActions: Stop drbd-testfs-ost0002-left:1#011(Slave lustre04-left) > LogActions: Stop drbd-testfs-ost0003:0#011(Master lustre04-left) > LogActions: Stop drbd-testfs-ost0003-left:0#011(Master lustre04-left) > LogActions: Stop drbd-testfs-ost0003-left:1#011(Slave lustre03-left) > > For some reason demote is not run on both mdt drbd esources (should > it?), so drbd RA prints warning about that. So its not just a logging error, the demote really isn't scheduled? That would be bad, can you file a bug please? > > What I see then is that ms-drbd-testfs-mdt-left is tried to stop > before ms-drbd-testfs-mdt. > > More, testfs-mdt filesystem resource is not stopped before stopping > drbd-testfs-mdt. > > I have advisory ordering constraints between mdt and ost filesystem > resources, so all ost's are stopped before mdt. Thus mdt stop is delayed > a bit. May be this influences what happens. > > I'm pretty sure I have correct constraints for at least these three > resources, so it looks like a bug, because mandatory ordering is not > preserved. 
> > I can produce report for this. > > Best, > Vladislav > > ___ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] CIB not saved
On Thu, Mar 29, 2012 at 8:45 PM, Fiorenza Meini wrote: > Il 29/03/2012 10:12, Rasto Levrinc ha scritto: > >> On Thu, Mar 29, 2012 at 9:54 AM, Fiorenza Meini wrote: >>> >>> Hi there, >>> a strange thing happened to my two node cluster: I rebooted both machine >>> at >>> the same time, when s.o. went up again, no resources were configured >>> anymore: as it was a fresh installation. Why ? >>> It was explained to me that the configuration of resources managed by >>> pacemaker should be in a file called cib.xml, but cannot find it in the >>> system. Have I to specify any particular option in the configuration >>> file? >> >> >> Normally you shouldn't worry about it. cib.xml is stored in >> /var/lib/heartbeat/crm/ or similar and the directory should have have >> hacluster:haclient permissions. What distro is it and how did you install >> it? >> >> Rasto >> > > Thanks, it was a permission problems. Normally we log an error at startup if we can't write there... did this not happen? ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] OCF_RESKEY_CRM_meta_{ordered,notify,interleave}
On Fri, Mar 30, 2012 at 1:47 AM, Florian Haas wrote: > Lars (lmb), or Andrew -- maybe one of you remembers what this was all about. > > In this commit, Lars enabled the > OCF_RESKEY_CRM_meta_{ordered,notify,interleave} attributes to be > injected into the environment of RAs: > https://github.com/ClusterLabs/pacemaker/commit/b0ba01f61086f073be69db3e6beb0914642f79d9 > > Then that change was almost immediately backed out: > https://github.com/ClusterLabs/pacemaker/commit/b33d3bf5376ab59baa435086c803b9fdaf6de504 Because it was felt that RAs shouldn't need to know. Those options change pacemaker's behaviour, not the RAs. But subsequently, in lf#2391, you convinced us to add notify since it allowed the drbd agent to error out if they were not turned on. > > And since then, at some point evidently only interleave and notify > made it back in. Any specific reason for omitting ordered? I happen to > have a pretty good use case for an ordered-clone RA, and it would be > handy to be able to test whether clone ordering has been enabled. I'd need more information. The RA shouldn't need to care I would have thought. The ordering happens in the PE/crmd, the RA should just do what its told. > > All insights are much appreciated. > > Cheers, > Florian > > -- > Need help with High Availability? > http://www.hastexo.com/now > > ___ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
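The drbd-style guard Andrew refers to boils down to a validate-time check of the injected meta attribute. A minimal sketch, assuming the RA has already sourced the usual ocf-shellfuncs helpers (the message text and default value are illustrative):

# Refuse to run as a clone unless notifications are enabled, so that a
# misconfiguration fails loudly instead of misbehaving later.
if [ -n "$OCF_RESKEY_CRM_meta_clone_max" ] && \
   ! ocf_is_true "${OCF_RESKEY_CRM_meta_notify:-false}"; then
    ocf_log err "This resource must be configured as a clone with notify=true"
    exit "$OCF_ERR_CONFIGURED"
fi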
[Pacemaker] OCF_RESKEY_CRM_meta_{ordered,notify,interleave}
Lars (lmb), or Andrew -- maybe one of you remembers what this was all about. In this commit, Lars enabled the OCF_RESKEY_CRM_meta_{ordered,notify,interleave} attributes to be injected into the environment of RAs: https://github.com/ClusterLabs/pacemaker/commit/b0ba01f61086f073be69db3e6beb0914642f79d9 Then that change was almost immediately backed out: https://github.com/ClusterLabs/pacemaker/commit/b33d3bf5376ab59baa435086c803b9fdaf6de504 And since then, at some point evidently only interleave and notify made it back in. Any specific reason for omitting ordered? I happen to have a pretty good use case for an ordered-clone RA, and it would be handy to be able to test whether clone ordering has been enabled. All insights are much appreciated. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] VirtualDomain Shutdown Timeout
Hi Andrew, Thanks, that sounds good. I am using the Ubuntu HA ppa, so I will wait for a 1.1.7 package to become available. Andrew - Original Message - From: "Andrew Beekhof" To: "The Pacemaker cluster resource manager" Sent: Thursday, March 29, 2012 1:08:21 AM Subject: Re: [Pacemaker] VirtualDomain Shutdown Timeout On Sun, Mar 25, 2012 at 6:27 AM, Andrew Martin wrote: > Hello, > > I have configured a KVM virtual machine primitive using Pacemaker 1.1.6 and > Heartbeat 3.0.5 on Ubuntu 10.04 Server using DRBD as the storage device (so > there is no shared storage, no live-migration): > primitive p_vm ocf:heartbeat:VirtualDomain \ > params config="/vmstore/config/vm.xml" \ > meta allow-migrate="false" \ > op start interval="0" timeout="180s" \ > op stop interval="0" timeout="120s" \ > op monitor interval="10" timeout="30" > > I would expect the following events to happen on failover on the "from" node > (the migration source) if the VM hangs while shutting down: > 1. VirtualDomain issues "virsh shutdown vm" to gracefully shutdown the VM > 2. pacemaker waits 120 seconds for the timeout specified in the "op stop" > timeout > 3. VirtualDomain waits a bit less than 120 seconds to see if it will > gracefully shutdown. Once it gets to almost 120 seconds, it issues "virsh > destroy vm" to hard stop the VM. > 4. pacemaker wakes up from the 120 second timeout and sees that the VM has > stopped and proceeds with the failover > > However, I observed that VirtualDomain seems to be using the timeout from > the "op start" line, 180 seconds, yet pacemaker uses the 120 second timeout. > Thus, the VM is still running after the pacemaker timeout is reached and so > the node is STONITHed. Here is the relevant section of code from > /usr/lib/ocf/resource.d/heartbeat/VirtualDomain: > VirtualDomain_Stop() { > local i > local status > local shutdown_timeout > local out ex > > VirtualDomain_Status > status=$? > > case $status in > $OCF_SUCCESS) > if ! ocf_is_true $OCF_RESKEY_force_stop; then > # Issue a graceful shutdown request > ocf_log info "Issuing graceful shutdown request for domain > ${DOMAIN_NAME}." > virsh $VIRSH_OPTIONS shutdown ${DOMAIN_NAME} > # The "shutdown_timeout" we use here is the operation > # timeout specified in the CIB, minus 5 seconds > shutdown_timeout=$(( $NOW + > ($OCF_RESKEY_CRM_meta_timeout/1000) -5 )) > # Loop on status until we reach $shutdown_timeout > while [ $NOW -lt $shutdown_timeout ]; do > > Doesn't $OCF_RESKEY_CRM_meta_timeout correspond to the timeout value in the > "op stop ..." line? It should, however there was a bug in 1.1.6 where this wasn't the case. The relevant patch is: https://github.com/beekhof/pacemaker/commit/fcfe6fe Or you could try 1.1.7 > > How can I optimize my pacemaker configuration so that the VM will attempt to > gracefully shutdown and then at worst case destroy the VM before the > pacemaker timeout is reached? Moreover, is there anything I can do inside of > the VM (another Ubuntu 10.04 install) to optimize/speed up the shutdown > process? 
> > Thanks, > > Andrew > > > ___ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org > ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
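To make the arithmetic in the RA excerpt concrete, here is the intended relationship between the configured stop timeout and the point at which VirtualDomain gives up on a graceful shutdown, once OCF_RESKEY_CRM_meta_timeout really does carry the stop timeout (the 120000 ms value is assumed to match the 120s stop operation above):

#!/bin/sh
# Worked example of VirtualDomain's shutdown_timeout calculation.
NOW=$(date +%s)
OCF_RESKEY_CRM_meta_timeout=120000          # 120s stop timeout, in milliseconds
shutdown_timeout=$(( NOW + (OCF_RESKEY_CRM_meta_timeout / 1000) - 5 ))
# The RA polls until $shutdown_timeout and only then issues "virsh destroy",
# i.e. roughly 5 seconds before Pacemaker's own stop timeout expires.
echo "graceful shutdown window: $(( shutdown_timeout - NOW )) seconds"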
Re: [Pacemaker] Issue with ordering
On Thu, Mar 29, 2012 at 11:40 AM, Vladislav Bogdanov wrote: > Hi Florian, > > 29.03.2012 11:54, Florian Haas wrote: >> On Thu, Mar 29, 2012 at 10:07 AM, Vladislav Bogdanov >> wrote: >>> Hi Andrew, all, >>> >>> I'm continuing experiments with lustre on stacked drbd, and see >>> following problem: >> >> At the risk of going off topic, can you explain *why* you want to do >> this? If you need a distributed, replicated filesystem with >> asynchronous replication capability (the latter presumably for DR), >> why not use a Distributed-Replicated GlusterFS volume with >> geo-replication? > > I need fast POSIX fs scalable to tens of petabytes with support for > fallocate() and friends to prevent fragmentation. > > I generally agree with Linus about FUSE and userspace filesystems in > general, so that is not an option. I generally agree with Linus and just about everyone else that filesystems shouldn't require invasive core kernel patches. But I digress. :) > Using any API except what VFS provides via syscalls+glibc is not an > option too because I need access to files from various scripted > languages including shell and directly from a web server written in C. > Having bindings for them all is a real overkill. And it all is in > userspace again. > > So I generally have choice of CEPH, Lustre, GPFS and PVFS. > > CEPH is still very alpha, so I can't rely on it, although I keep my eye > on it. > > GPFS is not an option because it is not free and produced by IBM (can't > say which of these two is more important ;) ) > > Can't remember why exactly PVFS is a no-go, their site is down right > now. Probably userspace server implementation (although some examples > like nfs server discredit idea of in-kernel servers, I still believe > this is a way to go). Ceph is 100% userspace server side, jftr. :) And it has no async replication capability at this point, which you seem to be after. > Lustre is widely deployed, predictable and stable. It fully runs in > kernel space. Although Oracle did its best to bury Lustre development, > it is actively developed by whamcloud and company. They have builds for > EL6, so I'm pretty happy with this. Lustre doesn't have any replication > built-in so I need to add it on a lower layer (no rsync, no rsync, no > rsync ;) ). DRBD suits my needs for a simple HA. > > But I also need datacenter-level HA, that's why I evaluate stacked DRBD > and tickets with booth. > > So, frankly speaking, I decided to go with Lustre not because it is so > cool (it has many-many niceties), but because all others I know do not > suit my needs at all due to various reasons. > > Hope this clarifies my point, It does. Doesn't necessarily mean I agree, but the point you're making is fine. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Pacemaker + Oracle
cat /etc/oratab And maybe you can post your log :-) Il giorno 29 marzo 2012 13:53, Ruwan Fernando ha scritto: > Hi, > I'm working with Pacemaker Active Passive Cluster and need to use oracle > as a resource to the pacemaker. my resource script is > crm configureprimitive Oracle ocf:heartbeat:oracle params sid=OracleDB op > monitor inetrval=120s > but it is not worked for me. > > Can someone help out on this matter? > > Regards, > Ruwan > > ___ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org > > -- esta es mi vida e me la vivo hasta que dios quiera ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
[Pacemaker] Pacemaker + Oracle
Hi, I'm working with a Pacemaker active/passive cluster and need to use Oracle as a resource in Pacemaker. My resource command is crm configureprimitive Oracle ocf:heartbeat:oracle params sid=OracleDB op monitor inetrval=120s but it does not work for me. Can someone help out on this matter? Regards, Ruwan ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
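As posted, the command is missing a space between "configure" and "primitive" and misspells "interval", both of which the crm shell will reject. The intended command was presumably something close to the following (the SID and timings are whatever fits the local database):

crm configure primitive Oracle ocf:heartbeat:oracle \
        params sid="OracleDB" \
        op monitor interval="120s" timeout="30s"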
Re: [Pacemaker] CIB not saved
On 29/03/2012 10:12, Rasto Levrinc wrote: On Thu, Mar 29, 2012 at 9:54 AM, Fiorenza Meini wrote: Hi there, a strange thing happened to my two node cluster: I rebooted both machine at the same time, when s.o. went up again, no resources were configured anymore: as it was a fresh installation. Why ? It was explained to me that the configuration of resources managed by pacemaker should be in a file called cib.xml, but cannot find it in the system. Have I to specify any particular option in the configuration file? Normally you shouldn't worry about it. cib.xml is stored in /var/lib/heartbeat/crm/ or similar and the directory should have have hacluster:haclient permissions. What distro is it and how did you install it? Rasto Thanks, it was a permissions problem. Regards -- Fiorenza Meini Spazio Web S.r.l. V. Dante Alighieri, 10 - 13900 Biella Tel.: 015.2431982 - 015.9526066 Fax: 015.2522600 Reg. Imprese, CF e P.I.: 02414430021 Iscr. REA: BI - 188936 Iscr. CCIAA: Biella - 188936 Cap. Soc.: 30.000,00 Euro i.v. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Issue with ordering
Hi Florian, 29.03.2012 11:54, Florian Haas wrote: > On Thu, Mar 29, 2012 at 10:07 AM, Vladislav Bogdanov > wrote: >> Hi Andrew, all, >> >> I'm continuing experiments with lustre on stacked drbd, and see >> following problem: > > At the risk of going off topic, can you explain *why* you want to do > this? If you need a distributed, replicated filesystem with > asynchronous replication capability (the latter presumably for DR), > why not use a Distributed-Replicated GlusterFS volume with > geo-replication? I need fast POSIX fs scalable to tens of petabytes with support for fallocate() and friends to prevent fragmentation. I generally agree with Linus about FUSE and userspace filesystems in general, so that is not an option. Using any API except what VFS provides via syscalls+glibc is not an option too because I need access to files from various scripted languages including shell and directly from a web server written in C. Having bindings for them all is a real overkill. And it all is in userspace again. So I generally have choice of CEPH, Lustre, GPFS and PVFS. CEPH is still very alpha, so I can't rely on it, although I keep my eye on it. GPFS is not an option because it is not free and produced by IBM (can't say which of these two is more important ;) ) Can't remember why exactly PVFS is a no-go, their site is down right now. Probably userspace server implementation (although some examples like nfs server discredit idea of in-kernel servers, I still believe this is a way to go). Lustre is widely deployed, predictable and stable. It fully runs in kernel space. Although Oracle did its best to bury Lustre development, it is actively developed by whamcloud and company. They have builds for EL6, so I'm pretty happy with this. Lustre doesn't have any replication built-in so I need to add it on a lower layer (no rsync, no rsync, no rsync ;) ). DRBD suits my needs for a simple HA. But I also need datacenter-level HA, that's why I evaluate stacked DRBD and tickets with booth. So, frankly speaking, I decided to go with Lustre not because it is so cool (it has many-many niceties), but because all others I know do not suit my needs at all due to various reasons. Hope this clarifies my point, Best, Vladislav ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Issue with ordering
On Thu, Mar 29, 2012 at 10:07 AM, Vladislav Bogdanov wrote: > Hi Andrew, all, > > I'm continuing experiments with lustre on stacked drbd, and see > following problem: At the risk of going off topic, can you explain *why* you want to do this? If you need a distributed, replicated filesystem with asynchronous replication capability (the latter presumably for DR), why not use a Distributed-Replicated GlusterFS volume with geo-replication? Note that I know next to nothing about your actual detailed requirements, so GlusterFS may well be non-ideal for you and my suggestion may thus be moot, but it would be nice if you could explain why you're doing this. Cheers, Florian -- Need help with High Availability? http://www.hastexo.com/now ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] CIB not saved
On Thu, Mar 29, 2012 at 9:54 AM, Fiorenza Meini wrote: > Hi there, > a strange thing happened to my two node cluster: I rebooted both machine at > the same time, when s.o. went up again, no resources were configured > anymore: as it was a fresh installation. Why ? > It was explained to me that the configuration of resources managed by > pacemaker should be in a file called cib.xml, but cannot find it in the > system. Have I to specify any particular option in the configuration file? Normally you shouldn't worry about it. cib.xml is stored in /var/lib/heartbeat/crm/ or similar and the directory should have hacluster:haclient permissions. What distro is it and how did you install it? Rasto -- Dipl.-Ing. Rastislav Levrinc rasto.levr...@gmail.com Linux Cluster Management Console http://lcmc.sf.net/ ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
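A quick way to check and repair what Rasto describes (the directory path, owner and mode can vary between distributions and versions, so treat these as a sketch):

ls -ld /var/lib/heartbeat/crm            # should be owned by hacluster:haclient
ls -l  /var/lib/heartbeat/crm/cib.xml*   # cib.xml and related files
chown -R hacluster:haclient /var/lib/heartbeat/crm
chmod 750 /var/lib/heartbeat/crm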
[Pacemaker] Issue with ordering
Hi Andrew, all,

I'm continuing my experiments with Lustre on stacked DRBD, and I see the following problem:

I have one drbd resource (ms-drbd-testfs-mdt) that is stacked on top of another (ms-drbd-testfs-mdt-left), with the following constraints between them:

colocation drbd-testfs-mdt-with-drbd-testfs-mdt-left inf: ms-drbd-testfs-mdt ms-drbd-testfs-mdt-left:Master
order drbd-testfs-mdt-after-drbd-testfs-mdt-left inf: ms-drbd-testfs-mdt-left:promote ms-drbd-testfs-mdt:start

Then I have a filesystem mounted on top of ms-drbd-testfs-mdt (the testfs-mdt resource):

colocation testfs-mdt-with-drbd-testfs-mdt inf: testfs-mdt ms-drbd-testfs-mdt:Master
order testfs-mdt-after-drbd-testfs-mdt inf: ms-drbd-testfs-mdt:promote testfs-mdt:start

When I trigger an event which causes many resources to stop (including these three), the LogActions output looks like:

LogActions: Stop drbd-local#011(lustre01-left)
LogActions: Stop drbd-stacked#011(Started lustre02-left)
LogActions: Stop drbd-testfs-local#011(Started lustre03-left)
LogActions: Stop drbd-testfs-stacked#011(Started lustre04-left)
LogActions: Stop lustre#011(Started lustre04-left)
LogActions: Stop mgs#011(Started lustre01-left)
LogActions: Stop testfs#011(Started lustre03-left)
LogActions: Stop testfs-mdt#011(Started lustre01-left)
LogActions: Stop testfs-ost#011(Started lustre01-left)
LogActions: Stop testfs-ost0001#011(Started lustre02-left)
LogActions: Stop testfs-ost0002#011(Started lustre03-left)
LogActions: Stop testfs-ost0003#011(Started lustre04-left)
LogActions: Stop drbd-mgs:0#011(Master lustre01-left)
LogActions: Stop drbd-mgs:1#011(Slave lustre02-left)
LogActions: Stop drbd-testfs-mdt:0#011(Master lustre01-left)
LogActions: Stop drbd-testfs-mdt-left:0#011(Master lustre01-left)
LogActions: Stop drbd-testfs-mdt-left:1#011(Slave lustre02-left)
LogActions: Stop drbd-testfs-ost:0#011(Master lustre01-left)
LogActions: Stop drbd-testfs-ost-left:0#011(Master lustre01-left)
LogActions: Stop drbd-testfs-ost-left:1#011(Slave lustre02-left)
LogActions: Stop drbd-testfs-ost0001:0#011(Master lustre02-left)
LogActions: Stop drbd-testfs-ost0001-left:0#011(Master lustre02-left)
LogActions: Stop drbd-testfs-ost0001-left:1#011(Slave lustre01-left)
LogActions: Stop drbd-testfs-ost0002:0#011(Master lustre03-left)
LogActions: Stop drbd-testfs-ost0002-left:0#011(Master lustre03-left)
LogActions: Stop drbd-testfs-ost0002-left:1#011(Slave lustre04-left)
LogActions: Stop drbd-testfs-ost0003:0#011(Master lustre04-left)
LogActions: Stop drbd-testfs-ost0003-left:0#011(Master lustre04-left)
LogActions: Stop drbd-testfs-ost0003-left:1#011(Slave lustre03-left)

For some reason demote is not run on either mdt drbd resource (should it be?), so the drbd RA prints a warning about that.

What I see then is that ms-drbd-testfs-mdt-left is stopped before ms-drbd-testfs-mdt.

More, the testfs-mdt filesystem resource is not stopped before drbd-testfs-mdt is stopped.

I have advisory ordering constraints between the mdt and ost filesystem resources, so all osts are stopped before the mdt. Thus the mdt stop is delayed a bit. Maybe this influences what happens.

I'm pretty sure I have correct constraints for at least these three resources, so it looks like a bug, because mandatory ordering is not preserved.

I can produce a report for this.

Best,
Vladislav

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
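The advisory mdt/ost ordering mentioned above is not shown in the message; it would presumably be a score-0 constraint along these lines (a guess at the exact form, using resource names from the LogActions output), which lets the osts stop before the mdt without making the ordering mandatory:

order testfs-ost-after-testfs-mdt 0: testfs-mdt:start testfs-ost:start
order testfs-ost0001-after-testfs-mdt 0: testfs-mdt:start testfs-ost0001:start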
[Pacemaker] CIB not saved
Hi there, a strange thing happened to my two-node cluster: I rebooted both machines at the same time, and when the OS came up again, no resources were configured any more, as if it were a fresh installation. Why? It was explained to me that the configuration of resources managed by Pacemaker should be in a file called cib.xml, but I cannot find it on the system. Do I have to specify any particular option in the configuration file? Thanks and regards -- Fiorenza Meini Spazio Web S.r.l. V. Dante Alighieri, 10 - 13900 Biella Tel.: 015.2431982 - 015.9526066 Fax: 015.2522600 Reg. Imprese, CF e P.I.: 02414430021 Iscr. REA: BI - 188936 Iscr. CCIAA: Biella - 188936 Cap. Soc.: 30.000,00 Euro i.v. ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Migration of "lower" resource causes dependent resources to restart
29.03.2012 10:07, Andrew Beekhof wrote: > On Thu, Mar 29, 2012 at 5:43 PM, Vladislav Bogdanov > wrote: >> 29.03.2012 09:35, Andrew Beekhof wrote: >>> On Thu, Mar 29, 2012 at 5:28 PM, Vladislav Bogdanov >>> wrote: Hi Andrew, all, Pacemaker restarts resources when resource they depend on (ordering only, no colocation) is migrated. I mean that when I do crm resource migrate lustre, I get LogActions: Migrate lustre#011(Started lustre03-left -> lustre04-left) LogActions: Restart mgs#011(Started lustre01-left) I only have one ordering constraint for these two resources: order mgs-after-lustre inf: lustre:start mgs:start This reminds me what have been with reload in a past (dependent resource restart when "lower" resource is reloaded). Shouldn't this be changed? Migration usually means that service is not interrupted... >>> >>> Is that strictly true? Always? >> >> This probably depends on implementation. >> With qemu live migration - yes. > > So there will be no point at which, for example, pinging the VM's ip > address fails? Even all existing connections are preserved. Small delays during last migration phase are still possible, but they are minor (during around 100-200 milliseconds while context is switching and ip is announced from another node). And packets are not lost, just delayed a bit. I have corosync/pacemaker udpu clusters in VMs, and even corosync is happy when VM it runs on is migrating to another node (with some token tuning). > >> With pacemaker:Dummy (with meta allow-migrate="true") probably yes too... >> >>> My understanding was although A thinks the migration happens >>> instantaneously, it is in fact more likely to be pause+migrate+resume >>> and during that time anyone trying to talk to A during that time is >>> going to be disappointed. >> >> >>> >>> ___ >>> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org >>> http://oss.clusterlabs.org/mailman/listinfo/pacemaker >>> >>> Project Home: http://www.clusterlabs.org >>> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >>> Bugs: http://bugs.clusterlabs.org >> >> >> ___ >> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker >> >> Project Home: http://www.clusterlabs.org >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >> Bugs: http://bugs.clusterlabs.org > > ___ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
Re: [Pacemaker] Migration of "lower" resource causes dependent resources to restart
On Thu, Mar 29, 2012 at 5:43 PM, Vladislav Bogdanov wrote: > 29.03.2012 09:35, Andrew Beekhof wrote: >> On Thu, Mar 29, 2012 at 5:28 PM, Vladislav Bogdanov >> wrote: >>> Hi Andrew, all, >>> >>> Pacemaker restarts resources when resource they depend on (ordering >>> only, no colocation) is migrated. >>> >>> I mean that when I do crm resource migrate lustre, I get >>> >>> LogActions: Migrate lustre#011(Started lustre03-left -> lustre04-left) >>> LogActions: Restart mgs#011(Started lustre01-left) >>> >>> I only have one ordering constraint for these two resources: >>> >>> order mgs-after-lustre inf: lustre:start mgs:start >>> >>> This reminds me what have been with reload in a past (dependent resource >>> restart when "lower" resource is reloaded). >>> >>> Shouldn't this be changed? Migration usually means that service is not >>> interrupted... >> >> Is that strictly true? Always? > > This probably depends on implementation. > With qemu live migration - yes. So there will be no point at which, for example, pinging the VM's ip address fails? > With pacemaker:Dummy (with meta allow-migrate="true") probably yes too... > >> My understanding was although A thinks the migration happens >> instantaneously, it is in fact more likely to be pause+migrate+resume >> and during that time anyone trying to talk to A during that time is >> going to be disappointed. > > >> >> ___ >> Pacemaker mailing list: Pacemaker@oss.clusterlabs.org >> http://oss.clusterlabs.org/mailman/listinfo/pacemaker >> >> Project Home: http://www.clusterlabs.org >> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf >> Bugs: http://bugs.clusterlabs.org > > > ___ > Pacemaker mailing list: Pacemaker@oss.clusterlabs.org > http://oss.clusterlabs.org/mailman/listinfo/pacemaker > > Project Home: http://www.clusterlabs.org > Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf > Bugs: http://bugs.clusterlabs.org ___ Pacemaker mailing list: Pacemaker@oss.clusterlabs.org http://oss.clusterlabs.org/mailman/listinfo/pacemaker Project Home: http://www.clusterlabs.org Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf Bugs: http://bugs.clusterlabs.org
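For completeness, the scenario under discussion is triggered and undone with the crm shell's migrate/unmigrate commands, roughly like this (node name taken from the earlier LogActions output):

crm resource migrate lustre lustre04-left   # adds a temporary location constraint
crm_mon -1                                  # mgs is restarted even though only lustre moved
crm resource unmigrate lustre               # removes the constraint again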