[Pacemaker] DRBD cluster and UPS

2012-10-10 Thread Vladimir Elisseev
Hello, I'm trying to find a good solution to properly shut down a DRBD (master/slave) cluster in case of outages. The problem, at least for me, is that if you simply shut down one node, when it comes back in 99% of the cases you'll end up with split brain scenarios. So for me the proper logic in cas

Re: [Pacemaker] Announce: pcs-0.9.26

2012-10-10 Thread Digimer
On 10/08/2012 08:27 PM, Chris Feist wrote: We've been making improvements to the pcs (pacemaker/corosync configuration system) command line tool over the past few months. Currently you can setup a basic cluster (including configuring corosync 2.0 udpu). David Vossel has also created a version o

Re: [Pacemaker] high cib load on config change

2012-10-10 Thread James Harper
> > Questions: > - are you making any config changes when this behaviour is occurring? > - if so, from one node only or many? > - what version is this? 1.1.7 or 1.1.7 plus some debian patches? which > patches? > One other thing I just noticed is that ntp wasn't working on the two nodes used fo

Re: [Pacemaker] FW: Some resources are restarted after a node joins back cluster after failover.

2012-10-10 Thread Andrew Beekhof
On Tue, Oct 2, 2012 at 6:42 PM, Poonam Agarwal wrote: > Hi, > > > > I had sent this message before, do not know how and why it got dropped. Perhaps you weren't subscribed yet? > > I am facing below issue. Can somebody please help? There were some problems in this area in the past. Would you con

Re: [Pacemaker] high cib load on config change

2012-10-10 Thread James Harper
> > I guess I'd first like to know if the log entries I was seeing ("Failed > > application of an update diff" and "Requesting re-sync from peer") mean > > that a full resync is being done, and if that's a problem or not. > > There are occasions when it's not a problem, but I don't think any of th

Re: [Pacemaker] Pacemaker 1.1.6 order possible bug ?

2012-10-10 Thread Andrew Beekhof
On Mon, Sep 10, 2012 at 10:15 PM, Tomáš Vavřička wrote: > On 09/06/2012 07:41 AM, Tomáš Vavřička wrote: >> >> On 09/05/2012 04:54 PM, Dejan Muhamedagic wrote: >>> >>> Hi, >>> >>> On Wed, Sep 05, 2012 at 03:09:15PM +0200, Tomáš Vavřička wrote: On 09/05/2012 11:44 AM, Dejan Muhamedagic wro

Re: [Pacemaker] [PATCH] Couldn't find src in NULL - missing null check

2012-10-10 Thread Andrew Beekhof
I've merged both patches now. Thanks :) On Sun, Oct 7, 2012 at 3:28 AM, Grüninger, Andreas (LGL Extern) wrote: > I found this in the logfile: > > > Sep 29 19:14:29 [770]cib: notice: log_cib_diff: cib:diff: > Diff: --- 0.11.3 > Sep 29 19:14:29 [770]cib: notice: log_

Re: [Pacemaker] Failed to connection to the cluster

2012-10-10 Thread Andrew Beekhof
Did you actually start pacemaker? On Tue, Sep 25, 2012 at 1:30 AM, 龙龙 wrote: > Hi, > Now, I have installed pacemaker on two nodes (node4 and node2), they work > fine. Then, I installed pacemaker on another node (7)---with the same configure > file. But when I use crm_mon to watch the cluster status, it

Re: [Pacemaker] How to gang multiple APC AP7901 outlets together under the same name

2012-10-10 Thread Andrew Beekhof
Nice guide. You should consider adding it to the wiki. On Thu, Oct 4, 2012 at 12:20 AM, Epps, Josh wrote: > Procedure: How to gang multiple APC AP7901 outlets together under the same > name > > > > Note:Do not include any quote marks. They are used below just to > indicate verbatim strings. >

Re: [Pacemaker] STONIH device and two-active DRBDs with GFS2

2012-10-10 Thread Andrew Beekhof
On Sat, Oct 6, 2012 at 8:28 PM, Tero Mäntyvaara wrote: > Hi, > > I have read the tutorial you provide on your web site but I am > struggling with the node level STONITH device which I currently do not > have. > > I was wondering if I set two-active DRBD and because - by your > tutorial in chapter

Re: [Pacemaker] high cib load on config change

2012-10-10 Thread Andrew Beekhof
On Wed, Oct 10, 2012 at 6:44 PM, James Harper wrote: >> On 10/09/2012 01:42 PM, James Harper wrote: >> > As per previous post, I'm seeing very high cib load whenever I make a >> > configuration change, enough load that things timeout seemingly >> > instantly. I thought this was happening well befo

Re: [Pacemaker] centos 6 fence_apc parameter error

2012-10-10 Thread Andrew Beekhof
You should also be able to do pcmk_host_list=fail1,fail2 to avoid the need for quotes On Thu, Oct 11, 2012 at 1:45 AM, Michael Brennen wrote: > On Wed, 10 Oct 2012, Dejan Muhamedagic wrote: > >> Hi, >> >> On Tue, Oct 09, 2012 at 06:47:30PM -0500, Michael Brennen wrote: >>> >>> Hello all, >>> >>>

Re: [Pacemaker] Exiting corosync-notifyd results in shutting down of pacemakerd

2012-10-10 Thread Andrew Beekhof
On Thu, Oct 4, 2012 at 5:57 PM, Grüninger, Andreas (LGL Extern) wrote: >>> Is this an error or the desired result? > >>Based on the logs, pacemaker thinks corosync died. Did that happen? >>If so there is not much pacemaker can do :-( > > And that is absolutely ok when corosync dies. > Corosync do

[Pacemaker] ClusterLabs.org Documentation Update

2012-10-10 Thread Andrew Beekhof
In addition to some updates for 1.1.8, the documentation at http://www.clusterlabs.org/doc/ now comes in two flavours. Clusters from Scratch (and to a lesser extent Pacemaker Explained) now come in "pcs" and "crmsh" editions. So regardless of which admin tool you're a fan of, we've got you cover

Re: [Pacemaker] anybody using pacemaker w/ ganeti?

2012-10-10 Thread Andrew Beekhof
On Wed, Oct 10, 2012 at 1:12 AM, Miles Fidelman wrote: > Out of less-than-idle curiosity, is anybody using pacemaker on a ganeti > cluster? > > Ganeti sure looks like a nice package for building/managing small clusters, > including setup for DRBD and VM migration - BUT... it does not do > auto-fa

Re: [Pacemaker] centos 6 fence_apc parameter error

2012-10-10 Thread Michael Brennen
On Wed, 10 Oct 2012, Dejan Muhamedagic wrote: Hi, On Tue, Oct 09, 2012 at 06:47:30PM -0500, Michael Brennen wrote: Hello all, I have built a two node apache/mysql cluster, with drbd syncing the two. I am using the centos 6 corosync, pacemaker, and fence-agents packages. (I tried the cluster

Re: [Pacemaker] centos 6 fence_apc parameter error

2012-10-10 Thread Dejan Muhamedagic
Hi, On Tue, Oct 09, 2012 at 06:47:30PM -0500, Michael Brennen wrote: > Hello all, > > I have built a two node apache/mysql cluster, with drbd syncing the > two. I am using the centos 6 corosync, pacemaker, and fence-agents > packages. (I tried the clusterlabs 1.1.8 pacemaker with the pcs > shel

Re: [Pacemaker] The CPU usage is almost 100 % when booth execute crm_ticket command.

2012-10-10 Thread Yuichi SEINO
Hi Jiaju, I reported the bugzilla. http://bugs.clusterlabs.org/show_bug.cgi?id=5110 Sincerely, Yuichi 2012/10/10 Jiaju Zhang : > Hi Yuichi, > > On Wed, 2012-10-10 at 15:21 +0900, Yuichi SEINO wrote: >> Hi Jiaju, >> >> I find a problem. >> I use the latest version of pacemaker. >> >> pacemaker-1

Re: [Pacemaker] Announce: pcs-0.9.26

2012-10-10 Thread Andrew Beekhof
On Tue, Oct 9, 2012 at 11:27 AM, Chris Feist wrote: > We've been making improvements to the pcs (pacemaker/corosync configuration > system) command line tool over the past few months. > > Currently you can setup a basic cluster (including configuring corosync 2.0 > udpu). > > David Vossel has also

Re: [Pacemaker] A pacemaker resource agent for keepalived

2012-10-10 Thread Lars Marowsky-Bree
On 2012-10-10T10:11:09, Owen Le Blanc wrote: > We use keepalived instead of ldirectord for managing our load > balancers, partly for historical reasons, and partly because it has a > number of features which we have had difficulty implementing with > ldirectord. The attached resource agent seems

[Pacemaker] A pacemaker resource agent for keepalived

2012-10-10 Thread Owen Le Blanc
We use keepalived instead of ldirectord for managing our load balancers, partly for historical reasons, and partly because it has a number of features which we have had difficulty implementing with ldirectord. The attached resource agent seems to work well with keepalived under pacemaker 1.1.6.

[Pacemaker] A patch for stonith external/libvirt

2012-10-10 Thread Owen Le Blanc
I attach a patch for the stonith agent external/libvirt. This agent was failing on our machines because for rebooting machines it tried to stop and then start them, which doesn't work on our system, while rebooting them does. We have cluster glue version 1.0.8-2 installed on a Debian system, with

Re: [Pacemaker] high cib load on config change

2012-10-10 Thread James Harper
> On 10/09/2012 01:42 PM, James Harper wrote: > > As per previous post, I'm seeing very high cib load whenever I make a > > configuration change, enough load that things timeout seemingly > > instantly. I thought this was happening well before the configured > > timeout but now I'm not so sure, may