Re: [Linux-HA] Fencing : pb about 'dynamic-list'

2011-01-19 Thread Andrew Beekhof
On Wed, Jan 19, 2011 at 9:34 AM, Alain.Moulle wrote: > Hi, > > What I don't understand is the fact that same configuration of resource > to fence > was working in older releases ... There is a new stonith daemon that doesn't work like the old one. This is a RHEL system right? Why aren't you usin

Re: [Linux-HA] Unordered groups (was Re: Is 'resource_set' still experimental?)

2011-01-19 Thread Andrew Beekhof
On Tue, Jan 18, 2011 at 1:42 PM, Florian Haas wrote: > On 01/18/2011 11:49 AM, RaSca wrote: >> As discussed yesterday on IRC with Andrew, there is no way of creating a >> group with independent resources. >> I was hoping that setting the options you mentioned can do the trick, >> but I've just tes

Re: [Linux-HA] Fencing : pb about "dynamic-list"

2011-01-18 Thread Andrew Beekhof
The agent doesn't appear to support the list command (try stonith -l for that agent), so pacemaker is unable to determine which machines the device can kill. You may need to specify the hosts manually. On Thu, Jan 13, 2011 at 11:36 AM, Alain.Moulle wrote: > Hi, > > I have a pb of fencing not wor

Re: [Linux-HA] segfault problem

2011-01-18 Thread Andrew Beekhof
Fixed upstream. Presumably SUSE will pick it up in their next update. On Tue, Jan 18, 2011 at 1:04 PM, Haussecker, Armin wrote: > Hi, > > in our 2-node-cluster (SLES11 SP1) with pacemaker 1.1.2 - 0.7.1 we got the > following segfault: > > Jan 17 12:24:19 goat1 sudo: clusteradm : TTY=pts/1 ; PWD

Re: [Linux-HA] no quorum problem

2011-01-18 Thread Andrew Beekhof
Not sure though. > > Thanks, > > Dejan > >> Thanks in advance >> >> Pavlos Polianidis >> >> -Original Message- >> From: linux-ha-boun...@lists.linux-ha.org >> [mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Andrew Beekhof >>

Re: [Linux-HA] no quorum problem

2011-01-18 Thread Andrew Beekhof
..@lists.linux-ha.org > [mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Andrew Beekhof > Sent: Tuesday, January 18, 2011 2:03 PM > To: General Linux-HA mailing list > Subject: Re: [Linux-HA] no quorum problem > > On Tue, Jan 18, 2011 at 1:00 PM, Pavlos Polianidis > wrote:

Re: [Linux-HA] no quorum problem

2011-01-18 Thread Andrew Beekhof
69]: debug: > G_remove_client(pid=23010, reason='signoff' gsource=0x8836a00) { > > > I would attach the full logs but they are too large :) > > > Thanks in advance > > Pavlos Polianidis > > > Pavlos Polianidis | Technical Support Specialist > > V

Re: [Linux-HA] Is 'resource_set' still experimental?

2011-01-18 Thread Andrew Beekhof
On Tue, Jan 4, 2011 at 11:56 AM, Tobias Appel wrote: > On 12/28/2010 06:46 PM, Dejan Muhamedagic wrote: > >> >> 40 order constraints? A big cluster. >> > > We have currently 40 VM's (XEN) on it. I can't put them in a group since > they have to run independently and not necessarily on the same node

Re: [Linux-HA] no quorum problem

2011-01-18 Thread Andrew Beekhof
On Thu, Jan 13, 2011 at 3:17 PM, Pavlos Polianidis wrote: > Hello, > > > Currently I have installed heartbeat 3.0.2-2.el5 x86_64 and pacemaker > 1.0.7-4.el5 x86_64 on a CentOS release 5.3 x86_64 machine using yum > repositories. > > My configuration is the below: > Ha.cf > > debugfile /var/log/h

Re: [Linux-HA] Failover occurs even though ping score is a tie

2011-01-18 Thread Andrew Beekhof
Please don't use 2.0.8. Get a recent version of Pacemaker and you'll likely not have this problem. On Thu, Jan 13, 2011 at 6:33 PM, Brad Johnson wrote: > We are running heartbeat version 2.0.8 and I am wondering why this > happens and how to prevent it: > One of the ping nodes is on our LAN and whe

Re: [Linux-HA] Need help please stopping unnecessary resource failovers

2011-01-18 Thread Andrew Beekhof
For starters, you could try not using 2.0.8. On Fri, Jan 14, 2011 at 9:48 PM, Brad Johnson wrote: > Running heartbeat 2.0.8 I have a problem with unnecessary moving of our > resource to a different node when a pingd node goes dead or comes back > up. I am talking about the ping node itself going de

Re: [Linux-HA] Heartbeat and order of execution

2011-01-18 Thread Andrew Beekhof
Create a resource for the other script and then use a regular ordering constraint to have it start before the VIP. On Thu, Jan 13, 2011 at 7:13 PM, wrote: > Sorry if this is a silly question. I've been reading the docs and I'm > a little confused. > > I have a situation where I want heartbeat to
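
A minimal sketch of the advice in this reply, written as a crm shell fragment. Only the pattern (one primitive for the script, then an `order` constraint placing it before the VIP) comes from the reply; the resource ids `prep-script` and `vip`, the LSB script name, and the IP address are placeholders assumed for illustration. The function only prints the fragment so it can be reviewed before anything is loaded into a cluster.

```shell
#!/bin/sh
# Print a crm shell fragment that starts a helper script before the VIP.
# Assumptions (not from the thread): the resource ids, the LSB script
# name "prep-script", and the address 192.168.1.100 are placeholders.
print_order_config() {
    cat <<'EOF'
primitive prep-script lsb:prep-script
primitive vip ocf:heartbeat:IPaddr2 \
    params ip="192.168.1.100" \
    op monitor interval="30s"
order prep-before-vip inf: prep-script vip
EOF
}

print_order_config
```

The printed text could be saved to a file and loaded with `crm configure load update <file>`. With the `inf:` score the ordering is mandatory, so the VIP is only started once the script resource has started. If both resources must also run on the same node, a colocation constraint would probably be needed as well.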

Re: [Linux-HA] Fencing : pb about 'dynamic-list'

2011-01-18 Thread Andrew Beekhof
-S isn't terribly relevant, what -l shows is what counts. On Mon, Jan 17, 2011 at 2:19 PM, Alain.Moulle wrote: > Hi Dejan, > > Yes  stonith -t external/ipmi ... -S works fine : > /usr/sbin/stonith -t external/ipmi hostname=node2 ipaddr=' ' > userid='mylogin' passwd='mypass' interface='lan'  -S >

Re: [Linux-HA] Option 3 : corosync + cpg + cman + mcp

2011-01-17 Thread Andrew Beekhof
Did you make sure to use different values for "nodename:" on both nodes? It's an easy cut-and-paste error to make. Otherwise it looks pretty sane. What do the logs say? On Thu, Dec 16, 2010 at 1:46 PM, Alain.Moulle wrote: > Hi, > > I'm trying to get Option 3 working, but it does not start. > >

Re: [Linux-HA] Question about limits around resources

2011-01-17 Thread Andrew Beekhof
On Mon, Dec 13, 2010 at 10:46 AM, Alain.Moulle wrote: > Hi Andrew, > > Currently, my nodes are being reinstalled with RHEL6 GA, so as soon as > possible > I'll execute the same tests , but with the GA releases so : > pacemaker-1.1.2-7.el6 > corosync-1.2.3-21.el6.x86_64 > and by the way, I'll test

Re: [Linux-HA] Issues when running Heartbeat on FreeBSD 8.1 RELEASE

2011-01-17 Thread Andrew Beekhof
On Fri, Dec 10, 2010 at 4:26 PM, Kevin Mai wrote: > Hi folks, > > I'm trying to build a failover solution using FreeBSD 8.1-RELEASE and > Heartbeat from ports (v2.1.4-10). > > I've already configured heartbeat in the two peers, but once I start the > daemon using the /usr/local/etc/rc.d/heartbea

Re: [Linux-HA] Pacemaker & AWS elastic IPs

2011-01-17 Thread Andrew Beekhof
On Wed, Dec 15, 2010 at 9:11 PM, Andrew Miklas wrote: > Hi, > > On 26-Nov-10, at 1:41 AM, Andrew Beekhof wrote: > >>> The problem here is that these spurious node failures cause Pacemaker >>> to initiate unnecessary resource migrations.  Is it normal for the >>

Re: [Linux-HA] Problem with Heartbeat 3.0 + Pacemaker 1.1

2011-01-11 Thread Andrew Beekhof
> box. They normally do. Evidently I messed up and no-one noticed for a while. > > Thanks, > > Avestan :drunk: > > > > Andrew Beekhof-3 wrote: >> >> On Tue, Jan 11, 2011 at 8:15 AM, Andrew Beekhof >> wrote: >>> On Mon, Jan 10, 2011 at 7:05 PM

Re: [Linux-HA] Problem with Heartbeat 3.0 + Pacemaker 1.1

2011-01-10 Thread Andrew Beekhof
On Tue, Jan 11, 2011 at 8:15 AM, Andrew Beekhof wrote: > On Mon, Jan 10, 2011 at 7:05 PM, Avestan wrote: >> >> Hello Dejan, >> >> Thank you for taking the time to look into this.  In regard with the using >> null modem serial cable and the bandwidth, it doesn't

Re: [Linux-HA] Problem with Heartbeat 3.0 + Pacemaker 1.1

2011-01-10 Thread Andrew Beekhof
On Mon, Jan 10, 2011 at 7:05 PM, Avestan wrote: > > Hello Dejan, > > Thank you for taking the time to look into this.  In regard with the using > null modem serial cable and the bandwidth, it doesn't seem to be problematic > running Heartbeat R1 but I will take your advice and slide in a second >

Re: [Linux-HA] Force elect a new DC?

2010-12-24 Thread Andrew Beekhof
On Thu, Dec 23, 2010 at 3:05 PM, Brad Johnson wrote: > I know we are running an old version, and we plan on eventually > upgrading, but we can't upgrade now. So for now I am looking for any way > to avoid this unacceptably long fail-over delay by making sure the > active node is not the DC. The D

Re: [Linux-HA] Force elect a new DC?

2010-12-23 Thread Andrew Beekhof
On Wed, Dec 22, 2010 at 6:56 PM, Brad Johnson wrote: > We are running heartbeat version 2.0.8. The scenario is when the active > device (device currently running our resource) is also the Designated > Controller. If that active device goes down there is an additional delay > of 40 seconds beyond t

Re: [Linux-HA] Question about limits around resources .

2010-12-16 Thread Andrew Beekhof
On Thu, Dec 16, 2010 at 8:55 AM, Alain.Moulle wrote: > Hi, > > I just wanted to execute the same test with el6 GA releases, meaning : > corosync-1.2.3-21.el6.x86_64 > pacemaker-1.1.2-7.el6.x86_64 > > Good news , it seems much more stable , Even better news, this version doesn't include the perfor

Re: [Linux-HA] Are there any Linux alternatives to drbd and heartbeat?

2010-12-10 Thread Andrew Beekhof
On Fri, Dec 10, 2010 at 6:52 PM, Les Mikesell wrote: > On 12/10/2010 11:30 AM, Andrew Beekhof wrote: >> >>>> No-one is suggesting all clusters should run on Fedora. I was clearly >>>> trying to say that instructions for A are unlikely to work unmodified

Re: [Linux-HA] Are there any Linux alternatives to drbd and heartbeat?

2010-12-10 Thread Andrew Beekhof
On Fri, Dec 10, 2010 at 4:42 PM, Les Mikesell wrote: > On 12/10/2010 9:27 AM, Andrew Beekhof wrote: >> On Fri, Dec 10, 2010 at 2:53 PM, Les Mikesell  wrote: >>> On 12/10/10 2:20 AM, Andrew Beekhof wrote: >>>> >>>>> See "LRM operation We

Re: [Linux-HA] Are there any Linux alternatives to drbd and heartbeat?

2010-12-10 Thread Andrew Beekhof
On Fri, Dec 10, 2010 at 2:53 PM, Les Mikesell wrote: > On 12/10/10 2:20 AM, Andrew Beekhof wrote: >> >>> See "LRM operation WebSite_start_0 unknown error" from November, that's >>> where your pdf led me. By the time I hit "unknown error"

Re: [Linux-HA] Can no longer start/stop heartbeat properly

2010-12-10 Thread Andrew Beekhof
Oh, and 2.1.4??? Unless you're on SLES10, please update to a recent Pacemaker version. Not that this will solve this particular problem, you'll just be happier with the result. On Thu, Dec 9, 2010 at 3:16 PM, Bart Pousson wrote: > Hi, > > I have a system with two nodes that had been running heart

Re: [Linux-HA] Can no longer start/stop heartbeat properly

2010-12-10 Thread Andrew Beekhof
On Thu, Dec 9, 2010 at 6:48 PM, Bart Pousson wrote: > Thanks for the response, > > I did do a Google search on both logs before posting to this mailing list.   > This is what has been tried so far: > >   1. Several times the service was stopped and started using >      */etc/init.d/heartbeat*, but

Re: [Linux-HA] Question about limits around resources .

2010-12-10 Thread Andrew Beekhof
On Thu, Dec 9, 2010 at 2:11 PM, Alain.Moulle wrote: > Hi, > > Thanks. > So I have a robustness pb with Pacemaker/corosync ... you'll tell me > if it seems normal or not , if I miss something or not : Perfectly valid testcase, unacceptable result. Perhaps try with stonith-enabled=false so we can

Re: [Linux-HA] Are there any Linux alternatives to drbd and heartbeat?

2010-12-10 Thread Andrew Beekhof
On Thu, Dec 9, 2010 at 10:53 PM, Bart Coninckx wrote: > On Thursday 09 December 2010 22:21:57 Pavlos Parissis wrote: >> On 9 December 2010 17:09, Igor Chudov wrote: >> > On Thu, Dec 9, 2010 at 9:31 AM, Dimitri Maziuk > wrote: >> >> See "LRM operation WebSite_start_0 unknown error" from November,

Re: [Linux-HA] Are there any Linux alternatives to drbd and heartbeat?

2010-12-10 Thread Andrew Beekhof
On Thu, Dec 9, 2010 at 4:31 PM, Dimitri Maziuk wrote: > On 12/9/2010 4:05 AM, Andrew Beekhof wrote: >> On Wed, Dec 8, 2010 at 8:39 PM, Igor Chudov  wrote: >>> On Wed, Dec 8, 2010 at 1:32 PM, Serge Dubrouski  wrote: >>>> Taking into account "simple"

Re: [Linux-HA] nodes offline or pending

2010-12-09 Thread Andrew Beekhof
On Thu, Dec 9, 2010 at 12:15 PM, Bart Coninckx wrote: > On Wednesday 08 December 2010 12:30:09 Andrew Beekhof wrote: >> On Mon, Dec 6, 2010 at 10:30 PM, Bart Coninckx > wrote: >> > Hi, >> > >> > just finished setting up a two-node cluster with >> >

Re: [Linux-HA] Different on-fail actions for recurring monitor?

2010-12-09 Thread Andrew Beekhof
On Wed, Dec 1, 2010 at 10:08 AM, Andrew Miklas wrote: > Hi, > > I'm curious how the "on-fail" attribute of a recurring monitor > operation works.  From my testing, it seems that a recurring monitor > is considered to have failed any time its return doesn't match what > the cluster believes it shou

Re: [Linux-HA] How to set the resource-failure-stickiness

2010-12-09 Thread Andrew Beekhof
On Mon, Dec 6, 2010 at 9:09 AM, Bin Chen(sunwen_ling) wrote: > Hi guys, > > I found on the internet that there is an option named > resource-failure-stickiness, but I can't apply it to a primitive. Can you > take a look? That setting is no longer used in 1.0. Check out the documentation for (defau

Re: [Linux-HA] Are there any Linux alternatives to drbd and heartbeat?

2010-12-09 Thread Andrew Beekhof
On Wed, Dec 8, 2010 at 8:39 PM, Igor Chudov wrote: > On Wed, Dec 8, 2010 at 1:32 PM, Serge Dubrouski wrote: >> Taking into account "simple" the answer is no. You can try RedHat >> Cluster Suite on CentOS, but that's not simple. >> >> What's wrong with DRBD/Pacemaker/Corosync ? > > DRBD/Pacemaker

Re: [Linux-HA] Are there any Linux alternatives to drbd and heartbeat?

2010-12-09 Thread Andrew Beekhof
On Wed, Dec 8, 2010 at 10:36 PM, James Smith wrote: > I've spent the last several months learning drbd, pacemaker etc ... drbd > itself is surprisingly simple to get up running.  Im yet to experience > significant problems with it. > > Pacemaker has documentation, but I've certainly found it a t

Re: [Linux-HA] Question about limits around resources .

2010-12-09 Thread Andrew Beekhof
On Thu, Dec 9, 2010 at 9:06 AM, Alain.Moulle wrote: > Hi, > > I wonder if there are some limits with Pacemaker in terms of : > > nb of resources (primitives, groups, clones etc.) in the whole HA > Cluster with 2 or 4 nodes ? > > nb of resources (primitives, groups, clones etc.) per node in a HA Cl

Re: [Linux-HA] resource is in (unmanaged) state, what I should do?

2010-12-08 Thread Andrew Beekhof
On Wed, Dec 8, 2010 at 12:42 PM, Bin Chen(sunwen_ling) wrote: > On Wed, Dec 8, 2010 at 7:28 PM, Andrew Beekhof wrote: > >> On Wed, Dec 8, 2010 at 7:42 AM, Bin Chen(sunwen_ling) >> wrote: >> > Hi guys, >> > >> > My resource sometimes goes

Re: [Linux-HA] Option 3 : corosync + cpg + cman + mcp

2010-12-08 Thread Andrew Beekhof
On Wed, Dec 8, 2010 at 3:13 PM, Alain.Moulle wrote: > Hi Andrew, > > no because I have no lsb script pacemaker installed on RHEL6 with > my package : pacemaker-1.1.2-2 > > Does that mean that to configure Option 3, there is a pacemaker release > minimum > that is more recent than the 1.1.2-2  ? Y

Re: [Linux-HA] nodes offline or pending

2010-12-08 Thread Andrew Beekhof
On Mon, Dec 6, 2010 at 10:30 PM, Bart Coninckx wrote: > Hi, > > just finished setting up a two-node cluster with pacemaker-1.0.1-20.3.x86_64 > and openais-0.80.3-26.2.x86_64 (OpenSuse 11.2). 1.0.1??? Please get something newer from clusterlabs.org/rpm > I seem to have quite irradicate (or so I p

Re: [Linux-HA] resource is in (unmanaged) state, what I should do?

2010-12-08 Thread Andrew Beekhof
On Wed, Dec 8, 2010 at 7:42 AM, Bin Chen(sunwen_ling) wrote: > Hi guys, > > My resource sometimes goes into (unmanaged) state; what should I do if I > encounter this? I'm guessing it failed to stop and you didn't have stonith configured. But you didn't include anything that would allow us to com

Re: [Linux-HA] pacemaker with dual primary drbd resource

2010-12-08 Thread Andrew Beekhof
0.9.1+hg15626-1~bpo50+1  HA > cluster resource manager > > > Oliver > > On Wed, Dec 8, 2010 at 3:41 PM, Andrew Beekhof wrote: >> >> On Wed, Dec 8, 2010 at 8:03 AM, Linux Cook wrote: >> > Hi! >> > >> > Does this mean I need to enable stonith? >

Re: [Linux-HA] pacemaker with dual primary drbd resource

2010-12-07 Thread Andrew Beekhof
On Wed, Dec 8, 2010 at 8:03 AM, Linux Cook wrote: > Hi! > > Does this mean I need to enable stonith? No, it means your install is broken. Which version of pacemaker? Heartbeat or corosync? > > On Wed, Dec 8, 2010 at 12:13 PM, Linux Cook wrote: > >> I've followed the steps but I'm having issues r

Re: [Linux-HA] Option 3 : corosync + cpg + cman + mcp

2010-12-07 Thread Andrew Beekhof
On Tue, Dec 7, 2010 at 9:29 AM, Alain.Moulle wrote: > Hi, > > I'm trying to make a configuration with Option 3, to avoid problems with > corosync I had > with Option 0 (corosync + pacemaker plugin v0) . By the way, I tried the > option 2 but > corosync did not start, but as the future is option 4

Re: [Linux-HA] Resource appears to be active on two nodes

2010-12-02 Thread Andrew Beekhof
On Thu, Dec 2, 2010 at 9:41 AM, Preeti Jain wrote: > Andrew Beekhof beekhof.net> writes: > >> > but still which version i should move, can i do it with >> > heartbeat 2.1.4 without pacemaker >> > >> >> no, 2.1.4 was never supported > > then w

Re: [Linux-HA] Resource appears to be active on two nodes

2010-12-01 Thread Andrew Beekhof
On Thu, Dec 2, 2010 at 6:47 AM, bharat khandelwal wrote: > >> > then which version should i use to get this task done, >> > i am using suse 10.x86_64 >> >> Hmm, that seems quite old too, probably not supported anymore. >> You should update everything. >> >> Thanks, >> >> Dejan > > thanks for reply

Re: [Linux-HA] Resource appears to be active on two nodes

2010-12-01 Thread Andrew Beekhof
On Wed, Dec 1, 2010 at 12:42 PM, bharat khandelwal wrote: > >> > achieve this task. >> >  So, i need help with this version of heartbeat only without pacemaker. >> >  As i am architecture follows: >> >  Heartbeat version : 2.0.5 >> >> There's nobody using this version anymore for a very long time.

Re: [Linux-HA] custom jboss init script on pacemaker

2010-11-30 Thread Andrew Beekhof
On Wed, Dec 1, 2010 at 5:05 AM, Linux Cook wrote: > Thank you for your input, but I have my own custom init script, and it is > using my jboss engine. > > If I run /etc/init.d/ start/stop, it works. > > I added into my configuration like: > > primitive lsb: \ >        op monitor interval

Re: [Linux-HA] confused in two node heartbeat cluster

2010-11-30 Thread Andrew Beekhof
On Tue, Nov 30, 2010 at 8:19 PM, Mia Lueng wrote: > And since hbtest02 is fenced , why does hbtest01 not take over the resource? http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch05s03.html#id637389 > > > 2010/12/1 Dimitri Maziuk > >> Mia Lueng wrote: >> >> > Are t

Re: [Linux-HA] custom jboss init script on pacemaker

2010-11-30 Thread Andrew Beekhof
http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/s-resource-lsb.html On Tue, Nov 30, 2010 at 11:11 AM, Linux Cook wrote: > Hi > > How can I run jboss init script on pacemaker? Which resource should I use? > > thank you! > > Linux cook > _

Re: [Linux-HA] pacemaker error on postgresql drbd

2010-11-30 Thread Andrew Beekhof
On Tue, Nov 30, 2010 at 9:38 AM, Linux Cook wrote: > hi! > > What's wrong with my configuration? No idea. Perhaps check out the new howto for postgres: http://www.clusterlabs.org/wiki/PostgresHowto > DBIP: 10.110.10.5 > drbd resource: postgres > postgres service: postgresql > node1: dmcs1 > nod

Re: [Linux-HA] Pacemaker use of std dlm_controld versusdlm_controld.pcmk

2010-11-30 Thread Andrew Beekhof
On Tue, Nov 30, 2010 at 10:20 AM, Alain.Moulle wrote: > Hi Andrew, > > the context ?  it was the response to your question : >> Probably means it's too old then. Where did you get it? So this is from one of the RHEL6 betas? The GA version should support it.

Re: [Linux-HA] Pacemaker use of std dlm_controld versusdlm_controld.pcmk

2010-11-29 Thread Andrew Beekhof
context? On Mon, Nov 29, 2010 at 9:01 AM, Alain.Moulle wrote: > # rpm -qf /usr/sbin/crm_report > pacemaker-1.1.2-2.el6.x86_64 > ___ > Linux-HA mailing list > Linux-HA@lists.linux-ha.org > http://lists.linux-ha.org/mailman/listinfo/linux-ha > See also: h

Re: [Linux-HA] PATCH: Sysinfo RA

2010-11-26 Thread Andrew Beekhof
On Fri, Nov 26, 2010 at 11:53 AM, Dejan Muhamedagic wrote: > Hi, > > On Fri, Nov 26, 2010 at 11:13:35AM +0100, Andrew Beekhof wrote: >> On Fri, Nov 19, 2010 at 4:29 PM, Matthew Richardson >> wrote: >> > Please find attached a patch to the pacemaker SysInfo RA.

Re: [Linux-HA] 3 Node Cluster

2010-11-26 Thread Andrew Beekhof
On Fri, Nov 19, 2010 at 3:16 PM, Frank Lazzarini wrote: > Hi all, > > so I've been playing around a little bit with setting up a 3 node > cluster, and all seems to work fine when I do the start of the drbd > resources manually. So basically here is my setup, in this current setup > I don't use a

Re: [Linux-HA] HealthSMART RA re-written

2010-11-26 Thread Andrew Beekhof
On Thu, Nov 18, 2010 at 4:06 PM, Matthew Richardson wrote: > I've been playing with the existing HealthSMART RA in Pacemaker and have > discovered a number of fundamental bugs and errors with it that mean it > will never have worked for anyone. > > I've done a big overhaul of this RA, replacing mo

Re: [Linux-HA] PATCH: Sysinfo RA

2010-11-26 Thread Andrew Beekhof
On Fri, Nov 19, 2010 at 4:29 PM, Matthew Richardson wrote: > Please find attached a patch to the pacemaker SysInfo RA. I like the idea, but we probably need to keep the use of expr ( instead of $(()) ) for compatibility with non-bash systems. > This patch adds 2 new features: > > - Allow a list
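
To illustrate the compatibility point in this reply, here is a small sketch of integer arithmetic done with `expr` and backticks instead of `$(( ))`. It is not taken from the SysInfo RA or from the patch; the function name `sum_values` is made up for the example.

```shell
#!/bin/sh
# Sum integer arguments using expr, avoiding $(( )) arithmetic expansion.
# sum_values is a hypothetical helper, not part of the SysInfo RA.
sum_values() {
    total=0
    for v in "$@"; do
        # expr exits with status 1 when the result is 0, so the status is
        # ignored here to stay safe in scripts that run under "set -e".
        total=`expr "$total" + "$v"` || true
    done
    echo "$total"
}

sum_values 1 2 3
```

Each `expr` call forks a process, so this is slower than shell arithmetic, which rarely matters for a resource agent that runs a handful of calculations per monitor operation.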

Re: [Linux-HA] Pacemaker & AWS elastic IPs

2010-11-26 Thread Andrew Beekhof
On Fri, Nov 26, 2010 at 9:36 AM, Andrew Miklas wrote: > Hi, > > On 25-Nov-10, at 11:37 AM, Andrew Beekhof wrote: > >> Given what you've described, you could probably remove the while loop >> during stop. >> It should be safe because Amazon is ensuring that i

Re: [Linux-HA] Pacemaker use of std dlm_controld versus dlm_controld.pcmk

2010-11-25 Thread Andrew Beekhof
New thread? Probably means it's too old then. Where did you get it? On Wed, Nov 24, 2010 at 8:32 AM, Alain.Moulle wrote: > Hi, > > # crm_report --features > pcmk_report: unrecognized option '--features' > nodename:     ERROR: Not sure what to do, no tests or times to extract > > and : (no --featu

Re: [Linux-HA] Pacemaker & AWS elastic IPs

2010-11-25 Thread Andrew Beekhof
On Thu, Nov 25, 2010 at 10:22 AM, Andrew Miklas wrote: > Hi, > > On 23-Nov-10, at 3:52 AM, Andrew Beekhof wrote: > >>> Another question -- is it possible to define resources that do not >>> have stop actions?  On AWS, there is no need to explicitly stop an >>>

Re: [Linux-HA] How to make an node not electable.

2010-11-25 Thread Andrew Beekhof
On Thu, Nov 25, 2010 at 4:06 PM, Lars Ellenberg wrote: > On Thu, Nov 25, 2010 at 03:56:26PM +0100, Michael Schwartzkopff wrote: >> On Thursday 25 November 2010 15:51:32 Henrique Fernandes wrote: >> > If it is in standby it  still part of cluster and take decisions on quorum >> > and etc ? >> >> ye

Re: [Linux-HA] Pacemaker use of std dlm_controld versus dlm_controld.pcmk

2010-11-23 Thread Andrew Beekhof
On Tue, Nov 23, 2010 at 4:40 PM, Alain.Moulle wrote: > Hi, > > I've some issues in dlm with Pacemaker and ocfs2-pcmk, > and someone from RH told me : > >> Are you using dlm_controld.pcmk? If so, please try the latest versions of >> pacemaker that use the standard dlm_controld > I would just like t

Re: [Linux-HA] Linux heartbeat: which one becomes master?

2010-11-23 Thread Andrew Beekhof
On Tue, Nov 23, 2010 at 7:21 AM, Bin Chen(sunwen_ling) wrote: > Hi guys, > > I am a newbie here, maybe a silly question. > > Suppose I configured 2 machines to be active/passive with the linux > heartbeat and pacemaker, then I create a resource group in node1, commit it. > If I am correct the config

Re: [Linux-HA] Pacemaker & AWS elastic IPs

2010-11-23 Thread Andrew Beekhof
On Mon, Nov 22, 2010 at 11:13 PM, Andrew Miklas wrote: > Hi, > > On 20-Nov-10, at 12:47 AM, Andrew Beekhof wrote: > >> What do you think you gain by not increasing the timeout? >> We don't sit around doing nothing if it completes in only a fraction >> of the all

Re: [Linux-HA] Pacemaker & AWS elastic IPs

2010-11-20 Thread Andrew Beekhof
On Sat, Nov 20, 2010 at 8:02 AM, Andrew Miklas wrote: > Hi all, > > I'm trying to use Pacemaker on a Amazon Web Services' EC2 to > automatically reassign elastic IPs (Amazon's equivalent to floating or > virtual IPs) in the event of a node failure.  The setup I'm testing > with is two elastic IPs

Re: [Linux-HA] Problem with colocation and M/S resource

2010-11-14 Thread Andrew Beekhof
On Mon, Oct 25, 2010 at 11:52 AM, Marek Marczykowski wrote: > Hi, > > I have setup with M/S mysql resources. > The goal is to have one IP address on MySQL master and another for slave > (if any). When there is only (one) master - slave IP should be on the > master. > > My setup produces some weird

Re: [Linux-HA] Professional Support

2010-11-09 Thread Andrew Beekhof
On Tue, Nov 9, 2010 at 6:41 AM, Eric Schoeller wrote: > Hello, > > Before we roll out a pacemaker cluster I was tasked with identifying > ways to provide professional support for it. My team has a history with > Sun Cluster, and with that they're used to 24x7 phone support. I know > that LinBit of

Re: [Linux-HA] How to use STONITH plugins external/vmware?

2010-10-25 Thread Andrew Beekhof
On Fri, Oct 22, 2010 at 3:20 AM, Dika Ye wrote: > Dear All, > > > > I am new here; I want to know whether anybody knows how to configure the > STONITH plugin external/vmware? I wrote it a long time ago. It used to work but may not anymore, depending on how ESX has changed since then. This would be a g

Re: [Linux-HA] 10000.00us average ?

2010-10-25 Thread Andrew Beekhof
On Mon, Oct 25, 2010 at 10:30 AM, Frank Lazzarini wrote: > Hi all, > > just a random question, something that I never really understood in the > log files of pacemaker was the message when it says something like > > info: cib_stats: Processed 1 operations (1.00us average, 0% > utilization) in

Re: [Linux-HA] heartbeat with postgresql

2010-10-23 Thread Andrew Beekhof
On Fri, Oct 22, 2010 at 7:32 PM, Greg Woods wrote: > On Fri, 2010-10-22 at 18:32 +0200, Andrew Beekhof wrote: >> if you're just using v1 - that's not a cluster, >> that's a prayer. > > Then God must answer my prayers, because I have been using some simple > heartbeat v1

Re: [Linux-HA] heartbeat with postgresql

2010-10-23 Thread Andrew Beekhof
On Fri, Oct 22, 2010 at 7:23 PM, Dimitri Maziuk wrote: > Andrew Beekhof wrote: > > OK, I'll post this and shut up. > >> Or are you really trying to claim that: >> >>    linuxha1 IPaddr::192.168.85.3 httpd smb >> >> is fundamentally less complex

Re: [Linux-HA] heartbeat with postgresql

2010-10-22 Thread Andrew Beekhof
On Wed, Oct 20, 2010 at 7:09 PM, Greg Woods wrote: > On Wed, 2010-10-20 at 08:13 +0200, Andrew Beekhof wrote: > >> > Um, maybe because heartbeat v1 has a much much much much less steep >> > learning curve? >> >> I dispute that: >> >>     >>

Re: [Linux-HA] heartbeat with postgresql

2010-10-22 Thread Andrew Beekhof
On Wed, Oct 20, 2010 at 4:43 PM, Dimitri Maziuk wrote: > Andrew Beekhof wrote: >> On Tue, Oct 19, 2010 at 6:44 PM, Greg Woods wrote: >>> On Tue, 2010-10-19 at 10:01 -0600, Serge Dubrouski wrote: >>>> Any particular reason for using Heartbeat v1 instead of CRM/Pa

Re: [Linux-HA] heartbeat with postgresql

2010-10-19 Thread Andrew Beekhof
On Tue, Oct 19, 2010 at 6:44 PM, Greg Woods wrote: > On Tue, 2010-10-19 at 10:01 -0600, Serge Dubrouski wrote: >> Any particular reason for using Heartbeat v1 instead of CRM/Pacemaker? > > Um, maybe because heartbeat v1 has a much much much much less steep > learning curve? I dispute that: h

Re: [Linux-HA] Problem around stonith resources IP targets with fence_ipmilan

2010-10-19 Thread Andrew Beekhof
It's because you're using pcmk_host_check=none. Only use that if every device can fence every host. I don't have access at the moment, but iirc you need to tell each device which nodes it can kill with pcmk_host_list. Sent from my iPad On 12 Oct 2010, at 17:20, "Alain.Moulle" wrote: > ( Wit
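
A hedged sketch of what this reply describes: one fencing device per node, each told explicitly which node it can kill. The parameter names are lowercase in Pacemaker (`pcmk_host_list`, `pcmk_host_check`). The resource ids, node names, BMC addresses and credentials below are placeholders assumed for illustration, not values from the thread. The function only prints a crm shell fragment for review.

```shell
#!/bin/sh
# Print a crm shell fragment with one fence_ipmilan device per node.
# Assumptions (not from the thread): resource ids, node names node1/node2,
# BMC addresses and login/passwd values are all placeholders.
print_fence_config() {
    cat <<'EOF'
primitive fence-node1 stonith:fence_ipmilan \
    params ipaddr="10.0.0.11" login="admin" passwd="secret" \
        pcmk_host_list="node1" pcmk_host_check="static-list"
primitive fence-node2 stonith:fence_ipmilan \
    params ipaddr="10.0.0.12" login="admin" passwd="secret" \
        pcmk_host_list="node2" pcmk_host_check="static-list"
EOF
}

print_fence_config
```

With `pcmk_host_check="static-list"` the cluster relies on `pcmk_host_list` alone, so a device is never asked to fence a node it has no path to. `pcmk_host_check="none"` removes that check, which is only appropriate when every device can reach every host.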

Re: [Linux-HA] Stonith log entries

2010-10-19 Thread Andrew Beekhof
Sounds like stonithd isn't starting. Are these the standard f13 packages? Are you using them with heartbeat or corosync? Sent from my iPad On 14 Oct 2010, at 02:29, mike wrote: > Fedora 13 on i686 btw. > > On 10-10-13 09:26 PM, mike wrote: >> Hi all, >> >> I've started building a simple 2 nod

Re: [Linux-HA] Documentation of heartbeat protocol

2010-10-14 Thread Andrew Beekhof
On Thu, Oct 14, 2010 at 4:57 PM, Vadym Chepkov wrote: > On Thu, Oct 14, 2010 at 10:13 AM, Andrew Beekhof wrote: >> On Thu, Oct 14, 2010 at 2:23 PM, Vadym Chepkov wrote: >>> >>> On Oct 14, 2010, at 4:41 AM, Lars Ellenberg wrote: >>>> >>>> If

Re: [Linux-HA] Handling colocation constraints with more than 2 entries

2010-10-14 Thread Andrew Beekhof
On Thu, Oct 14, 2010 at 4:24 PM, Lars Marowsky-Bree wrote: > On 2010-10-08T15:08:48, Andrew Beekhof wrote: > >> >> but it doesn't address the original problem that >> >> the shell syntax for colocation constraints switches direction when >> >>

Re: [Linux-HA] Documentation of heartbeat protocol

2010-10-14 Thread Andrew Beekhof
On Thu, Oct 14, 2010 at 2:23 PM, Vadym Chepkov wrote: > > On Oct 14, 2010, at 4:41 AM, Lars Ellenberg wrote: >> >> If you happen to be somehow target locked on heartbeat, tell us why, >> and what you are trying to achieve, and we figure something out. > > Sorry for barge in, but I actually started

Re: [Linux-HA] Handling colocation constraints with more than 2 entries

2010-10-10 Thread Andrew Beekhof
On Fri, Oct 8, 2010 at 6:25 PM, Dejan Muhamedagic wrote: >> [1] Note that the XML syntax has no "direction" for colocation >> constraints without sets. > > Well, it does, just coded as different attributes. Or did I > misunderstand? What I meant was that the order in which rsc and with-rsc occur

Re: [Linux-HA] Handling colocation constraints with more than 2 entries

2010-10-08 Thread Andrew Beekhof
On Fri, Oct 8, 2010 at 10:31 AM, Lars Marowsky-Bree wrote: > On 2010-10-08T08:27:45, Andrew Beekhof wrote: > >> I'd be in favor of the join construct above (although I'd probably >> call it "depends"), > > Yes, one of the hardest problem in all of comp

Re: [Linux-HA] Handling colocation constraints with more than 2 entries

2010-10-07 Thread Andrew Beekhof
On Thu, Oct 7, 2010 at 5:49 PM, Lars Marowsky-Bree wrote: > On 2010-10-05T18:03:17, Andrew Beekhof wrote: > >> > Anyway, it's too late to change the semantics as >> > that would change behaviour of the existing clusters. >> Actually the solution is really qu

Re: [Linux-HA] Handling colocation constraints with more than 2 entries

2010-10-06 Thread Andrew Beekhof
On Wed, Oct 6, 2010 at 12:34 PM, Dejan Muhamedagic wrote: > On Tue, Oct 05, 2010 at 06:03:17PM +0200, Andrew Beekhof wrote: >> On Tue, Oct 5, 2010 at 3:47 PM, Dejan Muhamedagic >> wrote: >> > Hi, >> > >> > On Mon, Oct 04, 2010 at 10:36:21PM +0200, Andre

Re: [Linux-HA] Handling colocation constraints with more than 2 entries

2010-10-05 Thread Andrew Beekhof
On Wed, Oct 6, 2010 at 1:19 AM, Andreas Kurz wrote: > On 10/05/2010 06:03 PM, Andrew Beekhof wrote: >> On Tue, Oct 5, 2010 at 3:47 PM, Dejan Muhamedagic >> wrote: >>> Hi, >>> >>> On Mon, Oct 04, 2010 at 10:36:21PM +0200, Andreas Kurz wrote: >>>

Re: [Linux-HA] Handling colocation constraints with more than 2 entries

2010-10-05 Thread Andrew Beekhof
On Tue, Oct 5, 2010 at 3:47 PM, Dejan Muhamedagic wrote: > Hi, > > On Mon, Oct 04, 2010 at 10:36:21PM +0200, Andreas Kurz wrote: >> Hello, >> >> On 10/04/2010 05:36 PM, Dejan Muhamedagic wrote: >> > Hi, >> > >> > On Mon, Oct 04, 2010 at 04:01:50PM +0100, Matthew Richardson wrote: >> > I've been pl

Re: [Linux-HA] Handling colocation constraints with more than 2 entries

2010-10-04 Thread Andrew Beekhof
On Mon, Oct 4, 2010 at 10:36 PM, Andreas Kurz wrote: > Hello, > > On 10/04/2010 05:36 PM, Dejan Muhamedagic wrote: >> Hi, >> >> On Mon, Oct 04, 2010 at 04:01:50PM +0100, Matthew Richardson wrote: >> I've been playing with pacemaker for a while now, and have recently >> seen a user stung by an issu

Re: [Linux-HA] exit code for status when program is stopped, 7 or 3

2010-10-01 Thread Andrew Beekhof
On Fri, Oct 1, 2010 at 10:25 AM, Pavlos Parissis wrote: > Hi, > > I am checking if my script which starts a program is LSB compliant. > So, I followed the steps mentioned here [1] and in one of the steps > says > > Status (stopped): > /etc/init.d/some_service status ; echo "result: $?" > > Did the

Re: [Linux-HA] About error : text2task: Unsupported action: order-nfs-0-stop-end

2010-09-30 Thread Andrew Beekhof
On Thu, Sep 30, 2010 at 3:06 PM, Alain.Moulle wrote: > Hi, > > I'm using Pacemaker release : > pacemaker-1.1.2-2.el6.x86_64 > > I've passed the following crm command : > crm configure order order-nfs mandatory:  clone-fs1 clone-fs2 > clone-fs3   nfs > _ > _and I got this record in cib.xml : > >  

Re: [Linux-HA] About OCFS2 and Pacemaker

2010-09-28 Thread Andrew Beekhof
On Mon, Sep 27, 2010 at 1:09 PM, Alain.Moulle wrote: > Hi > > I've all re-configured with dlm and o2cb as clone in Pacemaker > configuration, > but unfortunately, all works fine on one node, but as soon as the clone o2cb > starts on second node, both nodes crash . > > So I've got the last releases

Re: [Linux-HA] About OCFS2 and Pacemaker

2010-09-22 Thread Andrew Beekhof
On Wed, Sep 22, 2010 at 10:24 AM, Alain.Moulle wrote: > Hi again, > > I've had a look at the external/sdb device, it seems to be useful > only if we don't use a "by network" fencing solution, so as I'm > using fence_ipmilan, perhaps this shared device is not useful > in my case, right? Correc

Re: [Linux-HA] Configuration question

2010-09-22 Thread Andrew Beekhof
On Tue, Sep 21, 2010 at 1:42 PM, Alain.Moulle wrote: > Hi, > > Suppose I have two stonith primitives configured on my two-nodes cluster, > restofencenode0 and restofencenode1 with : > location +INF for restofencenode0 on node1 > location -INF for restofencenode0 on node0 > location +INF for restof

Re: [Linux-HA] About OCFS2 and Pacemaker

2010-09-22 Thread Andrew Beekhof
On Wed, Sep 22, 2010 at 9:49 AM, Alain.Moulle wrote: > Hi Andrew, > > sorry to come again on this subject, I was about to switch to OCFS2 in > Pacemaker but : > > For configuration , I'm using the documentation "Oracle Cluster File > System 2", > I think it is always correct/valid ? > > It seems t

Re: [Linux-HA] About OCFS2 and Pacemaker

2010-09-21 Thread Andrew Beekhof
On Tue, Sep 21, 2010 at 12:48 PM, Alain.Moulle wrote: >> >> sorry but I don't fully understand : >> > >> > - I don't think there is any "fencing" functionality in the OCFS2 >> > management, >> >> >> there is suicide, but that is unrelated to my point >> > Ok I didn't know, but this could not

Re: [Linux-HA] About OCFS2 and Pacemaker

2010-09-21 Thread Andrew Beekhof
On Tue, Sep 21, 2010 at 11:13 AM, Alain.Moulle wrote: > Hi Andrew, > > sorry but I don't fully understand : > > - I don't think there is any "fencing" functionality in the OCFS2 > management, there is suicide, but that is unrelated to my point > so for me the membership information remains o

Re: [Linux-HA] Limit amount of resources migrating at the same time

2010-09-21 Thread Andrew Beekhof
On Wed, Sep 15, 2010 at 10:04 AM, Tobias Appel wrote: > Hi all, > > it's been some time since I last worked with Heartbeat. Now a workmate > asked me this question and hopefully I can get a short answer from you guys. > > The problem is that we use Xen on a Heartbeat cluster with a lot of > virtua

Re: [Linux-HA] Don't start resource until the other node is active

2010-09-20 Thread Andrew Beekhof
On Wed, Sep 15, 2010 at 9:13 AM, Jonathan Petersson wrote: > Hi all, > > Is there any argument which you can add using crm to tell the nodes > for the other node to come up before starting a resource? So you don't want rscA to start until nodeB is up? If so, create a resource that can only run on

Re: [Linux-HA] About OCFS2 and Pacemaker

2010-09-20 Thread Andrew Beekhof
On Mon, Sep 20, 2010 at 5:55 PM, Alain.Moulle wrote: > Hi > > I have a "philosophic" question about two nodes with FS under OCFS2 > and Pacemaker/corosync for the HA of both nodes. > > My choice was to let OCFS2 stack out of Pacemaker configuration, > so I let the services o2cb and ocfs2 started a

Re: [Linux-HA] Problem on resource name lentgh

2010-09-20 Thread Andrew Beekhof
On Mon, Sep 20, 2010 at 12:19 PM, Andrew Beekhof wrote: > Actually, looks like I'm wrong here: > > In get_lrm_resource() I see: > >        char rid[64]; > > Which is clearly wrong. http://hg.clusterlabs.org/pacemaker/1.1/rev/7311ce12fd40 > On Mon, Sep 20, 2010 at 11

Re: [Linux-HA] Problem on resource name lentgh

2010-09-20 Thread Andrew Beekhof
Actually, looks like I'm wrong here: In get_lrm_resource() I see: char rid[64]; Which is clearly wrong. On Mon, Sep 20, 2010 at 11:57 AM, Andrew Beekhof wrote: > On Fri, Sep 17, 2010 at 3:58 PM, Alain.Moulle wrote: >> Hi, >> >> just for info : >> it se

Re: [Linux-HA] Problem on resource name lentgh

2010-09-20 Thread Andrew Beekhof
On Fri, Sep 17, 2010 at 3:58 PM, Alain.Moulle wrote: > Hi, > > just for info : > it seems that there is a limit on the number of characters for names in > Pacemaker, > I had two resources names of 67 characters, gathered in a group, and the > resources > remain always ORPHANED , without clear reas
