Re: [Linux-HA] LINUX HA with MYSQL Replication

2012-10-02 Thread Andrew Beekhof
On Tue, Oct 2, 2012 at 6:22 AM, Yves Trudeau wrote: > Hi, > Pacemaker is indeed a good solution for this. Apparently Oracle even agrees now according to: http://www.clusterdb.com/mysql-cluster/mysql-now-provides-support-for-drbd/ > Regards, > > Yves > > Le 2012-09-24 09:25, varun116 a écr

Re: [Linux-HA] Unable to restart cluster using resource_failure_stickiness

2012-09-20 Thread Andrew Beekhof
On Tue, Sep 18, 2012 at 9:49 PM, Olivier BONHOMME wrote: > Hello the list, > > New user to the HA world, I am trying to create an active/passive > cluster unfortunately with an heartbeat 2.1.4 version for historical > reasons. Stop. Please do not use those old releases. They're terrible, the numb

Re: [Linux-HA] Monitor operations not running after maintenance-mode=false ??

2012-09-18 Thread Andrew Beekhof
On Tue, Sep 18, 2012 at 9:20 PM, Stefan Schloesser wrote: >> >> Hmm, that wasn't as helpful as I'd hoped. >> Could you create a bug and attach a crm_report covering the testcase >> please? >> > [>] > > You want a bug report in the Fedora system or with Ubuntu 12.04 ? > If Fedora, how do I relate u

Re: [Linux-HA] Monitor operations not running after maintenance-mode=false ??

2012-09-18 Thread Andrew Beekhof
On Wed, Sep 12, 2012 at 9:23 PM, Stefan Schloesser wrote: >> Hmmm. So would I. >> Can you post cibadmin -Ql when the cluster is in this state? > [>] > > Hi, > > see attached file. I delete a couple of resources and restarted the cluster > before taking this log to simplify things. > The behaviour

Re: [Linux-HA] Monitor operations not running after maintenance-mode=false ??

2012-09-12 Thread Andrew Beekhof
On Wed, Sep 12, 2012 at 5:58 PM, Stefan Schloesser wrote: >> > crm configure property maintenance-mode=true crm configure property >> > maintenance-mode=false >> > >> >> No bug. This is what maintenance-mode is supposed to do. >> What are you trying to achieve? > [>] > > Hi Andrew, > > my original

Re: [Linux-HA] Monitor operations not running after maintenance-mode=false ??

2012-09-12 Thread Andrew Beekhof
On Wed, Sep 12, 2012 at 4:23 PM, Stefan Schloesser wrote: > Hi, > > I have a mysql resource which is monitored every 30sec using the ocf ra. > After running > crm configure property maintenance-mode=true > crm configure property maintenance-mode=false > > I can see in the log that no monitor opera

Re: [Linux-HA] Antw: Duplicate monitor operation on a multi state resource

2012-09-04 Thread Andrew Beekhof
On Wed, Aug 22, 2012 at 6:35 PM, Lars Marowsky-Bree wrote: > On 2012-08-22T10:32:57, RaSca wrote: > >> Thank you Lars, >> In fact, this is what I've done and now everything is ok. But I want to >> understand one last thing: if the ID is calculated with the value of >> interval then why I don't ha

Re: [Linux-HA] Best way to know on which host a resource has failed and where it will be promoted

2012-08-30 Thread Andrew Beekhof
On Sun, Aug 26, 2012 at 7:08 PM, RaSca wrote: > Hi all, > I want to interact with the new master election. I don't know if I must > operate at a Resource Agent level or at cluster level, so I'm opened to > suggestions. > > Suppose I've got a multi state resource for which I have one master and > t

Re: [Linux-HA] pacemaker 1.1 corosync and ocfs2

2012-08-30 Thread Andrew Beekhof
On Thu, Aug 30, 2012 at 3:49 AM, Heitor Lessa wrote: > Hi, > Has anyone got success implementing Corosync 1.4 with pacemaker 1.1 and > OCFS2 in other distribution apart of Fedora? You'd want to add cman between corosync and pacemaker for this. See the "1.1-plugin" edition of "clusters from scra

Re: [Linux-HA] new 'recovery' ressource state ?

2012-08-30 Thread Andrew Beekhof
You can get the same behaviour by setting longer timeouts and having the RA not return until it decides one way or another that the resource is good or bad. The best way not to have pacemaker perform a premature failover, is to not tell us about failures until you're sure. On Wed, Aug 29, 2012 at

Re: [Linux-HA] IP Clone

2012-08-20 Thread Andrew Beekhof
On Tue, Aug 21, 2012 at 3:22 PM, Dimitri Maziuk wrote: > On 8/20/2012 7:32 PM, Andrew Beekhof wrote: >> On Tue, Aug 21, 2012 at 8:49 AM, Dimitri Maziuk >> wrote: > >>> You are not allowed to run the IP address on two servers at once, full >>> stop

Re: [Linux-HA] IP Clone

2012-08-20 Thread Andrew Beekhof
On Fri, Aug 17, 2012 at 12:09 PM, Yount, William D wrote: > I have two servers. I am using pacemaker/cman(corosync). I am trying to share > an IP address between them. I would like the IP address to run on both > servers at the same time. However, my testing has shown that the IP address > stay

Re: [Linux-HA] IP Clone

2012-08-20 Thread Andrew Beekhof
On Tue, Aug 21, 2012 at 8:49 AM, Dimitri Maziuk wrote: > On 08/20/2012 05:01 PM, Yount, William D wrote: >> I am trying to set up an Active/Active cluster. I have an > Active/Passive cluster up and running. > > I don't remember seeing a clear explanation of when, where, and why > you'd actually wa

Re: [Linux-HA] Question about pacemaker + heartbeat + postgres in active/passive configuration

2012-08-15 Thread Andrew Beekhof
ile, I now think it's time for me to move to Corosync. > > - Mike > > -Original Message- > From: linux-ha-boun...@lists.linux-ha.org > [mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Andrew Beekhof > Sent: Wednesday, August 15, 2012 7:40 PM > To: Gene

Re: [Linux-HA] Question about pacemaker + heartbeat + postgres in active/passive configuration

2012-08-15 Thread Andrew Beekhof
On Wed, Aug 15, 2012 at 10:25 PM, Renee Riffee wrote: > Hello everyone, > > Apologies if this is not the correct group for this question, but I am > seeking information on how to set up pacemaker with heartbeat and postgres in > an active/passive streaming (pg 9.1) configuration. I would prefer

Re: [Linux-HA] heartbeat and n-to1 clusters

2012-08-07 Thread Andrew Beekhof
On Tue, Aug 7, 2012 at 1:42 AM, Andy Furtado wrote: > Hello, > > > Is it possible to setup an n-to-1 cluster configuration and have heartbeat > manage a different VIP for each virtual pair. > The n-to-1 configuration would have a single slave node, able to take over > for any one of the failed N

Re: [Linux-HA] Heartbeat Error [Solved]

2012-08-05 Thread Andrew Beekhof
More recent versions will create the leaf directory for you when pacemaker starts. On Fri, Aug 3, 2012 at 5:39 PM, Yount, William D wrote: > I was able to fix the error by creating the directory manually. > /var/lib/heartbeat/cores was already there, I just added root. > > Kind of an odd problem

Re: [Linux-HA] Heartbeat Error

2012-08-05 Thread Andrew Beekhof
On Fri, Aug 3, 2012 at 5:18 PM, Yount, William D wrote: > I am using pacemaker and corosync. For some reason I keep getting this error > in my messages log: > > ERROR: Cannot chdir to [/var/lib/heartbeat/cores/root]: No such file or > directory > > Should I not worry about that since I am using

Re: [Linux-HA] Antw: How to configure ordered sets of unordered resources as described in Pacemaker doc ?

2012-08-01 Thread Andrew Beekhof
5" , so if the crm > delivered with Pacemaker > on RH does not support this, It was there but broken > that means that there was another way to > enter this type of configuration ? > Or perhaps that nobody has needed yet this type of sets ordering ? (Which > would b

Re: [Linux-HA] Antw: How to configure ordered sets of unordered resources as described in Pacemaker doc ?

2012-07-31 Thread Andrew Beekhof
ant information about the crm shell" > > Thanks > Alain > > > > From: Andrew Beekhof > To: General Linux-HA mailing list > Date: 30/07/2012 10:23 > Subject: Re: [Linux-HA] Antw: How to configure ordered sets of unordered > resources as described in

Re: [Linux-HA] HA iSCSI storage problem

2012-07-30 Thread Andrew Beekhof
On Mon, Jul 30, 2012 at 5:32 PM, Bruno MACADRE wrote: > On 30/07/2012 04:29, Andrew Beekhof wrote: >> On Mon, Jul 23, 2012 at 11:56 PM, Bruno MACADRE >> wrote: >>> Hi, >>> >>> I'm working on a 2-node active/passive iSCSI storage clu

Re: [Linux-HA] Antw: How to configure ordered sets of unordered resources as described in Pacemaker doc ?

2012-07-30 Thread Andrew Beekhof
bs/pacemaker GitHub page. > > Thanks again > Alain > > > > From: Andrew Beekhof > To: General Linux-HA mailing list > Date: 30/07/2012 08:30 > Subject: Re: [Linux-HA] Antw: How to configure ordered sets of unordered > resources as described in Pacemaker do

Re: [Linux-HA] Antw: How to configure ordered sets of unordered resources as described in Pacemaker doc ?

2012-07-29 Thread Andrew Beekhof
On Mon, Jul 30, 2012 at 3:58 PM, wrote: > Hi Andrew > sorry but I don't understand what you mean by "the stand-alone version of > the shell"? The shell is now a separate project. > Thanks > Alain > > > > From: Andrew Beekhof > To: Genera

Re: [Linux-HA] Antw: How to configure ordered sets of unordered resources as described in Pacemaker doc ?

2012-07-29 Thread Andrew Beekhof
On Fri, Jul 27, 2012 at 10:02 PM, wrote: > I've found in the mailing-list messages the syntax I could have written > with crm configure edit , something like : > order order-g-FS inf: ( fs-A fs-B fs-C fs-D fs-E ) ( exportfs-fs-A > exportfs-fs-B exportfs-fs-C exportfs-fs-D exportfs-fs-E ) > ri

Re: [Linux-HA] HA iSCSI storage problem

2012-07-29 Thread Andrew Beekhof
On Mon, Jul 23, 2012 at 11:56 PM, Bruno MACADRE wrote: > Hi, > > I'm working on a 2-node active/passive iSCSI storage cluster. > > (I follow the guide from linbit) > > All works fine, except when active server goes offline, in this case > the iSCSI target (ocf:heartbeat:iSC

Re: [Linux-HA] Heartbeat isn't switching to the 2nd node when Httpd is down!

2012-07-29 Thread Andrew Beekhof
On Wed, Jul 25, 2012 at 1:01 AM, Aboubakr Seddik Ouahabi wrote: > Hey there, I've created a thread somewhere, but I guess this is the right > place to seek help for this, and here is my issue as stated there: > > > Ok guys, that was very much appreciated and I thank you again. For now, I > just wa

Re: [Linux-HA] Pacemaker from source

2012-07-29 Thread Andrew Beekhof
Possibly your version of autotools was too old. On Thu, Jul 26, 2012 at 2:50 AM, Heitor Lessa wrote: > > Worked ! > I tried to get version via github and another tarball and did not work > (latest code), so looking for another versions I found a source rpm and > worked fine. > Package - > htt

Re: [Linux-HA] lsb agents on fedora17

2012-07-10 Thread Andrew Beekhof
On Mon, Jul 9, 2012 at 7:33 PM, wrote: > How is it possible to start/stop services on fedora17 using pacemaker? That is a different question to the subject ;-) Systemd unit files are not LSB compliant, so you cannot use lsb::somename In 1.1.8 (due in the next month or so) I have added native su

Re: [Linux-HA] stonith:external/ipmi was WARNING: Resources violate uniqueness

2012-07-01 Thread Andrew Beekhof
On Sat, Jun 30, 2012 at 12:31 AM, Andreas Kurz wrote: > On 06/29/2012 03:53 PM, EXTERNAL Konold Martin (erfrakon, RtP2/TEF72) wrote: >> Hi Andreas, >> >> thank you very much. Stonith works nicely when doing the 'kill -9 corosync' >> tests. >> >> When looking at /var/log/messages I can see entrie

Re: [Linux-HA] I need to edit my cib.xml manually

2012-06-25 Thread Andrew Beekhof
efore starting Pacemaker (just like with the RH > Cluster Suite where you can put in place a pre-configured cluster.conf > with all infos/resources inside and then start the CS) Sure you can. > > Alain > > > > From: Andrew Beekhof > To: General Linux-HA ma

Re: [Linux-HA] I need to edit my cib.xml manually

2012-06-20 Thread Andrew Beekhof
effort. What was the parameter and why not modify it before the cluster was shut down or after it came back up? > 5/and then start again Pacemaker on all nodes > and it seems to work fine. > (but for now, I test with a two-nodes cluster only) > > Alain > > > > From: Andrew

Re: [Linux-HA] I need to edit my cib.xml manually

2012-06-20 Thread Andrew Beekhof
On Wed, Jun 20, 2012 at 7:21 PM, wrote: > Effectively, it seems to work fine to remove all .sig, modify cib.xml and > start again Pacemaker ! That's really new for me who has looked, one year > ago, for > a way to configure Pacemaker from scratch, without starting it, and also > in certain cases

Re: [Linux-HA] Pacemaker/corosync ==> Pacemaker/cman (on RH 6.2)

2012-06-19 Thread Andrew Beekhof
we're planning to remove it in >> the next point release or so. >> the policy is to not ship things we're not supporting. > Yes I know, but if it is always delivered on 6.3, that will be sufficient > for me until I switch from stack option 2 to stack option 4 > >

Re: [Linux-HA] What's the meaning of "... Failed application of an update diff"

2012-06-19 Thread Andrew Beekhof
On Tue, Jun 19, 2012 at 6:29 PM, Lars Marowsky-Bree wrote: > On 2012-06-19T08:38:11, alain.mou...@bull.net wrote: > >> So that means that my modifications by crm configure edit , even if they >> are correct (I've re-checked them) , >> have potentially corrupt the Pacemaker configuration ? > > No.

Re: [Linux-HA] Pacemaker/corosync ==> Pacemaker/cman (on RH 6.2)

2012-06-19 Thread Andrew Beekhof
e policy is to not ship things we're not supporting. > > Thanks > Alain > > > > De :    Andrew Beekhof > A :     General Linux-HA mailing list > Date :  18/06/2012 23:38 > Objet : Re: [Linux-HA] Pacemaker/corosync ==> Pacemaker/cman (on RH 6.2) > Envoyé pa

Re: [Linux-HA] Problem with master/slave migration on fedora17

2012-06-18 Thread Andrew Beekhof
Not enough information I'm afraid. We need more than descriptions of the events, can you run crm_report for the period covered by your test? On Mon, Jun 18, 2012 at 6:29 PM, wrote: > Environment > > fedora17+corosync-2.0.1-1.fc17.x86_64+pacemaker-1.1.7-2.fc17.x86_64 > > two node cluster: > > #co

Re: [Linux-HA] What's the meaning of "... Failed application of an update diff"

2012-06-18 Thread Andrew Beekhof
On Mon, Jun 18, 2012 at 11:38 PM, wrote: > Hi > > What's the meaning of such syslog messages : > 1340026364 2012 Jun 18 15:32:44 xna1 daemon notice cib [11129]: notice: > cib_process_diff: Diff 0.966.1 -> 0.966.2 not applied to 0.966.1: Failed > application of an update diff Essentially we're tr

Re: [Linux-HA] Pacemaker/corosync ==> Pacemaker/cman (on RH 6.2)

2012-06-18 Thread Andrew Beekhof
> Thanks a lot > Alain > > > > From: Andrew Beekhof > To: General Linux-HA mailing list > Date: 16/06/2012 12:25 > Subject: Re: [Linux-HA] Pacemaker/corosync ==> Pacemaker/cman (on RH 6.2) > Sent by: linux-ha-boun...@lists.linux-ha.org > > > > On

Re: [Linux-HA] Pacemaker/corosync ==> Pacemaker/cman (on RH 6.2)

2012-06-16 Thread Andrew Beekhof
On Fri, Jun 15, 2012 at 10:06 PM, wrote: > Hi Andrew > > you recall me in an old thread here that effectively cman was not involved > > in option 4 : corosync + cpg + quorumd + mcp > whereas it is involved in option 3 : corosync + cpg + cman + mcp > but is seems that corosync is also used in both

Re: [Linux-HA] DRBD service

2012-06-14 Thread Andrew Beekhof
On Wed, Jun 13, 2012 at 9:10 AM, Yount, William D wrote: > I am not sure which list to send this to; DRBD, Pacemaker, Corosync, etc. But > I figured I would start here and let someone guide me to the correct group. > > I am trying to setup a DRBD Active/Active cluster for redundant storage. I >

Re: [Linux-HA] Resources of a group running on different hosts

2012-06-14 Thread Andrew Beekhof
On Mon, Jun 11, 2012 at 2:16 AM, Luca Meron wrote: > > > Hi. > I've created a 2 node active/passive cluster. The HA manages 17 resources, > among IP and other services. > But I've a problem when resource placement: after I add one of the latest > resource, it is started on node2 instead of node1

Re: [Linux-HA] Active/Active Cluster

2012-06-14 Thread Andrew Beekhof
Um, you appear to have cman and corosync as cloned resources. That's really not a good idea. Have you seen the "clusters from scratch" document? That would be a good place to start. On Thu, Jun 14, 2012 at 8:16 AM, Yount, William D wrote: > I have two servers, 10.89.99.31(KNTCLFS001) and 10.89.99

Re: [Linux-HA] Pacemaker/corosync ==> Pacemaker/cman (on RH 6.2)

2012-06-14 Thread Andrew Beekhof
we can't use two rings as with corosync? I believe it's possible, I don't know the details though. > > except if we use bond IF ? > > Alain > > > > From: Andrew Beekhof > To: General Linux-HA mailing list > Date: 13/06/2012 03:13 > Subject: Re: [Linu

Re: [Linux-HA] Pacemaker/corosync ==> Pacemaker/cman (on RH 6.2)

2012-06-13 Thread Andrew Beekhof
Good point. My bad :) > and crm_mon -1 ;-) > http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/_bringing_the_cluster_online_with_cman.html > > But ok it starts now, I'll test this stack. > Thanks a lot Andrew. > Alain > > > > From: An

Re: [Linux-HA] Pacemaker/corosync ==> Pacemaker/cman (on RH 6.2)

2012-06-12 Thread Andrew Beekhof
On Wed, Jun 13, 2012 at 12:41 AM, wrote: > Hi > > I tried to make a Pacemaker/cman stack working following the instructions > here : > (on Red-Hat 6.2) > > So I stopped corosync and Pacemaker > I remove the corosync.conf > (and there were no /etc/corosync/service/pcmk file) > > I've patched the c

Re: [Linux-HA] Resource too active / vsftpd ubuntu 10.04

2012-06-11 Thread Andrew Beekhof
On Mon, Jun 11, 2012 at 1:52 AM, Luca Meron wrote: > > Hi. I'm getting the error "Resource too active" on several standard ubuntu > 10.04 startup scripts, like nmbd, smbd, vsftpd and winbind. I didn't test with > ocf-tester but it looks very strange to me that they're not ocf standard! I Because t

Re: [Linux-HA] Active/Active Cluster

2012-06-11 Thread Andrew Beekhof
On Sat, Jun 9, 2012 at 9:51 PM, emmanuel segura wrote: > Why you are using cman & corosync together? > > I think you should use cman+pacemaker or corosync+pacemaker Right. cman uses corosync underneath, but you should only configure+start one of them. Probably cman in this case. > > > > 2012/6/9

Re: [Linux-HA] Strange Pacemaker issue

2012-06-11 Thread Andrew Beekhof
Software versions? On Thu, Jun 7, 2012 at 4:53 AM, Yves Trudeau wrote: > Hi Florian > >> >> "corosync-cfgtool -s" is identical on all nodes? > > Yes, of course node ID are different and the id correspond to the IP of > the local NIC. > >> "corosync-objctl | grep member" produces 5 members on all

Re: [Linux-HA] bug in fence_virsh?

2012-06-11 Thread Andrew Beekhof
On Fri, Jun 8, 2012 at 5:40 PM, Léon Keijser wrote: > On Thu, 2012-06-07 at 10:37 +1000, Andrew Beekhof wrote: >> > Now according to the fence_virsh ra info, the param 'port' should >> > indicate the name of the guest on the hypervisor. >> >> IIRC w

Re: [Linux-HA] bug in fence_virsh?

2012-06-06 Thread Andrew Beekhof
On Thu, Jun 7, 2012 at 4:50 AM, Léon Keijser wrote: > Hi, > > For a simple demonstration I've set up a 2-node cluster (both kvm > virtuals) and configured stonith to interact with the kvm hypervisor. My > config: > > [root@node2 ~]# crm configure show > node node1.testnet.lan > node node2.testnet.

Re: [Linux-HA] Question about stacks .

2012-06-03 Thread Andrew Beekhof
On Fri, Jun 1, 2012 at 9:10 PM, wrote: > Hi > > I'm a little bit confused about stack choices : > > On this page we can see that there were 4 Corosync-based options : > http://theclusterguy.clusterlabs.org/post/907043024/introducing-the-pacemaker-master-control-process-for > > So, starting with 1

Re: [Linux-HA] corosync/pacemaker cluster failed

2012-05-24 Thread Andrew Beekhof
On Fri, May 25, 2012 at 10:39 AM, Tracy Reed wrote: > On Fri, May 25, 2012 at 02:01:18AM +0200, Lars Ellenberg spake thusly: >> Something is broken with your IPaddr2 script. >> Relevant package would be resource-agents. >> >> I suggest you simply >>  wget >> https://raw.github.com/ClusterLabs/res

Re: [Linux-HA] Can /var/lib/pengine files be deleted at boot?

2012-05-15 Thread Andrew Beekhof
On Wed, May 16, 2012 at 3:17 AM, William Seligman wrote: > I've had some problems with my Linux pacemaker cluster recently. I traced the > problem to what I believe is incorrect state information that was saved in > directory /var/lib/pengine. Nope. /var/lib/pengine is a record of what the CIB l

Re: [Linux-HA] Pacemaker monitor

2012-05-09 Thread Andrew Beekhof
On Thu, Apr 26, 2012 at 1:17 PM, dong he wrote: > Hi, >      recently I'm clustering the OpenSIPS with two Ubuntu computers. > I did it step by step and used the tutorial : > http://anders.com/cms/259/Linux.Tutorial/OpenSer/Heartbeat.v2.0 > But unfortunately I still met so many problems. > > The f

Re: [Linux-HA] We Rebooted a Healthy Standby Node and All the Services on the Primary Node Restarted?

2012-05-09 Thread Andrew Beekhof
On Tue, May 8, 2012 at 4:27 AM, Robinson, Eric wrote: > Hi guys, we rebooted a standby node of a healthy cluster and suddenly all the > resources on the primary cluster restarted. What's up with that? Before > rebooting the standby node, we did the normal stuff to verify that all was > well. >

Re: [Linux-HA] HA samba?

2012-04-26 Thread Andrew Beekhof
On Thu, Apr 26, 2012 at 8:38 AM, Serge Dubrouski wrote: > On Wed, Apr 25, 2012 at 4:28 PM, Seth Galitzer wrote: > >> On 04/25/2012 05:12 PM, Dimitri Maziuk wrote: >> > On 04/25/2012 03:53 PM, Seth Galitzer wrote: >> >> Can anybody point me to recent docs on how to go about setting this up? >> >>

Re: [Linux-HA] IPaddr stop is broken

2012-04-16 Thread Andrew Beekhof
On Mon, Apr 16, 2012 at 10:19 PM, Dejan Muhamedagic wrote: > Hi Andrew, > > On Fri, Apr 13, 2012 at 03:50:21PM +1000, Andrew Beekhof wrote: >> Looks like someone forgot to strip off the trailing colon from the ifname >> >> +++ find_interface_generic 192.168.122.110 >

Re: [Linux-HA] problem with pind

2012-04-15 Thread Andrew Beekhof
On Fri, Apr 13, 2012 at 8:51 PM, S, MOHAMED (MOHAMED)** CTR ** wrote: > Hi, > > The Pacemaker_Explained.pdf document says that > > " setting of migration-threshold=2 and failure-timeout=60s would cause the > resource to move to a new node after 2 failures, and allow it to move back > (depending

[Linux-HA] IPaddr stop is broken

2012-04-12 Thread Andrew Beekhof
Looks like someone forgot to strip off the trailing colon from the ifname +++ find_interface_generic 192.168.122.110 +++ ipaddr=192.168.122.110 +++ read ifname linkstuff +++ ifconfig +++ : Read gave us ifname = eth0: +++ read inet addr junk +++ : Read gave us inet = inet addr = 192.168.122.103 +++
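The trace above shows the parsed interface name coming back as `eth0:` rather than `eth0`. As a minimal sketch of the kind of normalisation being described (in Python for illustration only; the real IPaddr agent is a shell script, and `interface_name` is a hypothetical helper, not part of any resource agent):

```python
def interface_name(ifconfig_line):
    """Return the interface name from an ifconfig header line, or None.

    Assumption: header lines start in column 0 and continuation lines
    (inet addr, flags, ...) are indented, as in typical ifconfig output.
    """
    if not ifconfig_line or ifconfig_line[0].isspace():
        return None  # blank or continuation line, no interface name here
    fields = ifconfig_line.split()
    if not fields:
        return None
    # Newer ifconfig prints "eth0:" where older versions print "eth0";
    # only a trailing colon is dropped, so aliases like "eth0:1" survive.
    return fields[0].rstrip(":")


if __name__ == "__main__":
    print(interface_name("eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500"))
    print(interface_name("eth0      Link encap:Ethernet"))
```

Stripping only the trailing colon keeps the helper safe for both output styles, which is the property the report suggests the agent was missing.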

Re: [Linux-HA] pacemaker+drbd promotion delay

2012-04-12 Thread Andrew Beekhof
On Fri, Apr 13, 2012 at 11:47 AM, Andrew Beekhof wrote: > On Thu, Apr 12, 2012 at 5:26 PM, Lars Ellenberg > wrote: >> On Wed, Apr 11, 2012 at 08:22:59AM +1000, Andrew Beekhof wrote: >>> It looks like the drbd RA is calling crm_master during the monitor action. >>>

Re: [Linux-HA] pacemaker+drbd promotion delay

2012-04-12 Thread Andrew Beekhof
On Thu, Apr 12, 2012 at 5:26 PM, Lars Ellenberg wrote: > On Wed, Apr 11, 2012 at 08:22:59AM +1000, Andrew Beekhof wrote: >> It looks like the drbd RA is calling crm_master during the monitor action. >> That wouldn't seem like a good idea as the value isn't counted until

Re: [Linux-HA] pacemaker+drbd promotion delay

2012-04-10 Thread Andrew Beekhof
't change). Has the drbd RA always done this? On Sat, Mar 31, 2012 at 2:56 AM, William Seligman wrote: > On 3/30/12 1:13 AM, Andrew Beekhof wrote: >> On Fri, Mar 30, 2012 at 2:57 AM, William Seligman >> wrote: >>> On 3/29/12 3:19 AM, Andrew Beekhof wrote: >>>

Re: [Linux-HA] Antw: Re: ERROR: do_recover: Action A_RECOVER (0000000001000000) not supported

2012-04-02 Thread Andrew Beekhof
On Mon, Apr 2, 2012 at 5:06 PM, Ulrich Windl wrote: >>>> Andrew Beekhof wrote on 30.03.2012 at 00:57 in >>>> message > : >> On Thu, Mar 29, 2012 at 8:31 PM, Ulrich Windl >> wrote: >> > Hi! >> > >> > We had a problem wh

Re: [Linux-HA] pacemaker+drbd promotion delay

2012-03-29 Thread Andrew Beekhof
On Fri, Mar 30, 2012 at 2:57 AM, William Seligman wrote: > On 3/29/12 3:19 AM, Andrew Beekhof wrote: >> On Wed, Mar 28, 2012 at 9:12 AM, William Seligman >> wrote: >>> The basics: Dual-primary cman+pacemaker+drbd cluster running on RHEL6.2; >>> spec >>>

Re: [Linux-HA] ERROR: do_recover: Action A_RECOVER (0000000001000000) not supported

2012-03-29 Thread Andrew Beekhof
On Thu, Mar 29, 2012 at 8:31 PM, Ulrich Windl wrote: > Hi! > > We had a problem when crmd crashed. Obviously, crmd after being restarted > tried to recover, but it seems recovery is not implemented yet: Recovery is implemented, just not graceful recovery without a restart of the process. Which

Re: [Linux-HA] pacemaker+drbd promotion delay

2012-03-29 Thread Andrew Beekhof
On Wed, Mar 28, 2012 at 9:12 AM, William Seligman wrote: > The basics: Dual-primary cman+pacemaker+drbd cluster running on RHEL6.2; spec > files and versions below. > > Problem: If I restart both nodes at the same time, or even just start > pacemaker > on both nodes at the same time, the drbd ms

Re: [Linux-HA] High Performance High Availability Guide: new community documentation project

2012-03-26 Thread Andrew Beekhof
On Mon, Mar 26, 2012 at 6:27 PM, Florian Haas wrote: > On Mon, Mar 26, 2012 at 1:52 AM, Andrew Beekhof wrote: >> On Fri, Mar 23, 2012 at 10:39 PM, Florian Haas wrote: >>> Hi everyone, >>> >>> for those interested in contributing to a community documentation

Re: [Linux-HA] ocf:pacemaker:ClusterMon does not stop to send mails

2012-03-25 Thread Andrew Beekhof
On Fri, Mar 23, 2012 at 4:21 AM, Christoph Bartoschek wrote: > On 22.03.2012 10:30, Dan Frincu wrote: >> My guess is that the default recheck interval is 15 minutes and that >> also triggers ClusterMon to send an email at that interval. >> >> IIRC, ClusterMon sends an email per each event (someo

Re: [Linux-HA] ocf:pacemaker:ClusterMon does not stop to send mails

2012-03-25 Thread Andrew Beekhof
On Thu, Mar 22, 2012 at 8:30 PM, Dan Frincu wrote: > Hi, > > On Wed, Mar 21, 2012 at 7:35 PM, Christoph Bartoschek > wrote: >> Hi, >> >> after the incident yesterday we got everything up again. However since >> then ocf:pacemaker:ClusterMon sends a mail every 15 minutes although >> everything is

Re: [Linux-HA] High Performance High Availability Guide: new community documentation project

2012-03-25 Thread Andrew Beekhof
On Fri, Mar 23, 2012 at 10:39 PM, Florian Haas wrote: > Hi everyone, > > for those interested in contributing to a community documentation > project focusing on performance optimization in high availability > clusters, please take a look at the following URLs: > > https://github.com/fghaas/hp-ha-g

Re: [Linux-HA] order transitivity (was Re: order troubles)

2012-03-23 Thread Andrew Beekhof
On Fri, Mar 23, 2012 at 10:30 AM, William Seligman wrote: > On 3/22/12 10:06 AM, Florian Haas wrote: >> On Thu, Mar 22, 2012 at 10:34 AM, Lars Ellenberg >> wrote: order o_nfs_before_vz 0: cl_fs_nfs cl_vz order o_vz_before_ve992 0: cl_vz ve992 >>> >>> a score of "0" is roughly equivalent

Re: [Linux-HA] Apparent problem in pacemaker ordering

2012-03-19 Thread Andrew Beekhof
On Tue, Mar 6, 2012 at 3:53 AM, Florian Haas wrote: > On Sat, Mar 3, 2012 at 8:14 PM, Florian Haas wrote: >> In other words, interleave=true is actually the reasonable thing to >> set on all clone instances by default, and I believe the pengine >> actually does use a default of interleave=true on

Re: [Linux-HA] fence_nut fencing agent - use NUT to fence via UPS

2012-03-01 Thread Andrew Beekhof
On Fri, Mar 2, 2012 at 9:37 AM, William Seligman wrote: > After days spent debugging a fencing issue with my cluster, I know for certain > that this fencing agent works, at least for me. I'd like to contribute it to > the > Linux HA community. > > In my cluster, the fencing mechanism is to use NU

Re: [Linux-HA] cman+pacemaker+drbd fencing problem

2012-02-28 Thread Andrew Beekhof
On Wed, Feb 29, 2012 at 5:21 AM, William Seligman wrote: > On 2/27/12 8:40 PM, Andrew Beekhof wrote: > >> Oh, what does the fence_pcmk file look like? > > This is a standard part of the pacemaker-1.1.6 package. I know, I wrote it :-) I'm just curious exactly what it conta

Re: [Linux-HA] cman+pacemaker+drbd fencing problem

2012-02-27 Thread Andrew Beekhof
Oh, what does the fence_pcmk file look like? On Tue, Feb 28, 2012 at 12:40 PM, Andrew Beekhof wrote: > On Tue, Feb 28, 2012 at 11:49 AM, William Seligman > wrote: >> I'm trying to set up an active/active HA cluster as explained in Clusters >> From >> Scratch (which

Re: [Linux-HA] cman+pacemaker+drbd fencing problem

2012-02-27 Thread Andrew Beekhof
On Tue, Feb 28, 2012 at 11:49 AM, William Seligman wrote: > I'm trying to set up an active/active HA cluster as explained in Clusters From > Scratch (which I just re-read after my last problem). > > I'll give versions and config files below, but I'll start with what happens. I > start with an acti

Re: [Linux-HA] Understanding the behavior of IPaddr2 clone

2012-02-24 Thread Andrew Beekhof
On Sat, Feb 25, 2012 at 6:39 AM, William Seligman wrote: >> At this point, it looks my notion of re-writing IPaddr2 won't work. I'm >> redesigning my cluster configuration so I don't require >> cloned/highly-available >> IP addresses. >> >> Is this a bug? Is there a bugzilla or similar resource

Re: [Linux-HA] Writing a stonith-ng fencing agent in perl

2012-02-23 Thread Andrew Beekhof
On Fri, Feb 24, 2012 at 9:56 AM, William Seligman wrote: > The real reason the perl-scripted fencing agents don't give the correct > response > to stonith-admin is that they're looking for a "action=XXX" parameter from > stdin, when the actual parameter being passed is "option=XXX". I looked up
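The mismatch described above (an agent waiting for `action=XXX` on stdin while the caller sends `option=XXX`) can be sketched as follows. This is a hedged illustration in Python, not the Perl agents from the thread: `parse_fence_args` and `requested_action` are hypothetical names, and the fallback default of `reboot` is an assumption rather than documented behaviour.

```python
import io


def parse_fence_args(stream):
    """Parse key=value lines, as a fencing agent might read them from stdin."""
    args = {}
    for raw in stream:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value pair, ignore it
        args[key.strip()] = value.strip()
    return args


def requested_action(args, default="reboot"):
    """Accept either spelling of the parameter; 'default' is an assumption."""
    for key in ("action", "option"):
        if key in args:
            return args[key]
    return default


if __name__ == "__main__":
    sample = io.StringIO("option=off\nport=node1\n")
    print(requested_action(parse_fence_args(sample)))
```

Accepting both keys keeps such an agent usable whichever name the calling stonith layer happens to send.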

Re: [Linux-HA] Writing a stonith-ng fencing agent in perl

2012-02-23 Thread Andrew Beekhof
On Thu, Feb 23, 2012 at 11:08 AM, William Seligman wrote: > On 2/22/12 6:20 PM, Andrew Beekhof wrote: >> On Thu, Feb 23, 2012 at 8:21 AM, William Seligman >> wrote: >>> About a 1.5 years ago, I wrote a fencing agent for Pacemaker 1.0.x; it used >>> NU

Re: [Linux-HA] Writing a stonith-ng fencing agent in perl

2012-02-22 Thread Andrew Beekhof
On Thu, Feb 23, 2012 at 8:21 AM, William Seligman wrote: > About a 1.5 years ago, I wrote a fencing agent for Pacemaker 1.0.x; it used > NUT > to shut down power on a UPS: > > > > I'm building a new HA cluster using: > > Sc

Re: [Linux-HA] MMM conflict with Pacemaker

2012-02-17 Thread Andrew Beekhof
On Fri, Feb 17, 2012 at 4:00 AM, Mark Grennan wrote: > Hi Marcus, > > One Issue I can think of is, Pacemaker wants to bind the floating IP as > eth#:#, while MMM wants to use a different method that can only be seen with > the IP command.   I think they are fighting over who owns the floating IP

Re: [Linux-HA] Understanding the behavior of IPaddr2 clone

2012-02-16 Thread Andrew Beekhof
On Fri, Feb 17, 2012 at 5:05 AM, Dejan Muhamedagic wrote: > Hi, > > On Wed, Feb 15, 2012 at 04:24:15PM -0500, William Seligman wrote: >> On 2/10/12 4:53 PM, William Seligman wrote: >> > I'm trying to set up an Active/Active cluster (yes, I hear the sounds of >> > kittens >> > dying). Versions: >>

Re: [Linux-HA] pacemaker/corosync - cl_status . REASON: hb_api_signon: Can't initiate connection to heartbeat

2012-02-16 Thread Andrew Beekhof
gards, > Thomas. > > -Original Message- > From: linux-ha-boun...@lists.linux-ha.org > [mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Andrew Beekhof > Sent: Wednesday, 15 February 2012 11:13 > To: General Linux-HA mailing list > Subject: Re: [L

Re: [Linux-HA] pacemaker/corosync - cl_status . REASON: hb_api_signon: Can't initiate connection to heartbeat

2012-02-15 Thread Andrew Beekhof
On Wed, Feb 15, 2012 at 5:50 PM, Florian Haas wrote: > On 02/14/12 03:09, Andrew Beekhof wrote: >> On Tue, Feb 14, 2012 at 7:26 AM, Thomas Baumann wrote: >>> Hello list, >>> >>> In my current pacemaker/corosync installation in a 2 node cluster I get >

Re: [Linux-HA] pacemaker/corosync - cl_status . REASON: hb_api_signon: Can't initiate connection to heartbeat

2012-02-13 Thread Andrew Beekhof
On Tue, Feb 14, 2012 at 7:26 AM, Thomas Baumann wrote: > Hello list, > > In my current pacemaker/corosync installation in a 2 node cluster I get > following error: > > # cl_status listnodes This is a heartbeat command, you're running corosync Try crm_node -p > > cl_status[3681]: 2012/02/13_21:18

Re: [Linux-HA] Status about ocfs2.pcmk ?

2012-02-05 Thread Andrew Beekhof
On Fri, Feb 3, 2012 at 7:29 PM, wrote: > Hi Andreas , > thanks for your response, but two questions : > 1/ why going with GFS2 ? because you know that ocfs2+pacemaker still does > not >    work fine on rhel ? or ... ? I wouldn't hold my breath waiting for OCFS2 to be supported on RHEL. The only

Re: [Linux-HA] FW: How DC is selected?

2012-02-05 Thread Andrew Beekhof
The location of the DC is an internal detail which you shouldn't care about. Why do you want to be able to predetermine its location? On Sun, Feb 5, 2012 at 1:53 PM, Mayank wrote: > Hello all, > > I'm using pacemaker to manage some of our resources including Virtual IP. > It is working well in

Re: [Linux-HA] RA timeouts / Remote node did not respond

2012-01-30 Thread Andrew Beekhof
On Mon, Jan 30, 2012 at 9:28 PM, Sascha Reimann wrote: > Hi Dejan, > > thanks for the hints! > > It's indeed a bigger cluster with currently 9 nodes and approximately 50 > resources, but we've planned to build an even bigger one, which will > probably not be possible due to the frequent updates. The time

Re: [Linux-HA] crmsh property management regression

2012-01-16 Thread Andrew Beekhof
On Tue, Jan 17, 2012 at 4:56 AM, Dejan Muhamedagic wrote: > On Mon, Jan 16, 2012 at 06:47:54PM +0300, Vladislav Bogdanov wrote: >> Hi Dejan, >> >> thank you very much for a good pointer, you saved me much time. >> >> 16.01.2012 16:20, Dejan Muhamedagic wrote: >> > Hi Vladislav, >> > >> > On Mon, J

Re: [Linux-HA] The active trap of the SNMP is delayed.

2012-01-10 Thread Andrew Beekhof
Mon, 2011/12/5, Gao,Yan wrote: > >> Hi Andrew, >> >> On 11/30/11 19:01, Gao,Yan wrote: >> > Hi Andrew, >> > >> > On 11/28/11 07:53, Andrew Beekhof wrote: >> >> On Thu, Nov 24, 2011 at 7:50 PM, Gao,Yan wrote: >> >>> Hi Hideo, &

Re: [Linux-HA] Light Weight Quorum Arbitration

2012-01-04 Thread Andrew Beekhof
Hi Tanja, Is $(DTDROOT)/make/common.mk available somewhere so we can try building these? On Thu, Jan 5, 2012 at 2:31 AM, Tanja Roth wrote: > Hi, > > On 2011-12-06 10:33 Florian Haas wrote: >>On Tue, Dec 6, 2011 at 9:50 AM, Lars Marowsky-Bree >>wrote: >>> On 2011-12-04T00:57:05, Andreas Kurz w

Re: [Linux-HA] Question or problem around migration

2011-12-20 Thread Andrew Beekhof
On Wed, Dec 21, 2011 at 12:50 AM, Dan Frincu wrote: > Hi, > > On Tue, Dec 20, 2011 at 3:35 PM,   wrote: >> Ooops, sorry, the behavior is not the same, you were true : >> with cluster-recheck-interval="90" >> crm resource migrate group1 node2 P300S >> migration is quite immediately effective >> the

Re: [Linux-HA] Antw: Re: Q on http://clusterlabs.org/wiki/FAQ#Resource_is_Too_Active

2011-12-18 Thread Andrew Beekhof
On Fri, Dec 16, 2011 at 11:34 PM, Ulrich Windl wrote: Dominik Klein wrote on 16.12.2011 at 12:34 in > message <4eeb2cdb.6020...@googlemail.com>: > >> >> On 12/15/2011 11:19 AM, Ulrich Windl wrote: >> > Hi! >> > >> > I have a problem with some client-server software (I don't want to

Re: [Linux-HA] resource unmanaged/failed

2011-12-11 Thread Andrew Beekhof
On Fri, Dec 9, 2011 at 7:46 PM, Aleksey V. Kashin wrote: >> How much do they have now? > > They have 12G RAM. That seems respectable. > >> How much is in use by the radius servers? > >                   total       used       free     shared    buffers     cached > Mem:         12038      11606

Re: [Linux-HA] resource unmanaged/failed

2011-12-08 Thread Andrew Beekhof
On Wed, Dec 7, 2011 at 9:56 PM, Aleksey V. Kashin wrote: > I can't increase the RAM on these servers. How can I keep the resource from > becoming "unmanaged/failed"? > How much do they have now? How much is in use by the radius servers?

Re: [Linux-HA] Pacemaker : Pb on stop on a resource while the monitoring is performed

2011-12-08 Thread Andrew Beekhof
On Tue, Nov 29, 2011 at 12:54 AM, wrote: > Hi > > I always have this problem. > Just a little question : when this occurs, meaning a monitoring happening > whereas there is > just a crm command request on the resource i.e. migration, why not just > return SUCCESS > so that the next monitoring on

Re: [Linux-HA] About user and passwd encoding in cib

2011-12-08 Thread Andrew Beekhof
Fencing agents in RHEL already support the password-script parameter (which could conceivably even query ldap or a database). What is your use-case? On Thu, Dec 8, 2011 at 11:41 PM, wrote: > Hi Dejan, > > ok but I remember that on RHEL, Red-Hat removes several things in > cluster-glue rpm, do yo

Re: [Linux-HA] Antw: What about "start-delay" attribute status ?

2011-12-06 Thread Andrew Beekhof
On Tue, Nov 29, 2011 at 12:40 AM, Dejan Muhamedagic wrote: > On Mon, Nov 28, 2011 at 10:36:06AM +1100, Andrew Beekhof wrote: >> On Thu, Nov 24, 2011 at 8:52 PM, Dejan Muhamedagic >> wrote: >> > Hi, >> > >> > On Wed, Nov 23, 2011 at 08:52:43AM +1100,

Re: [Linux-HA] Pacemaker : how to modify configuration ?

2011-11-28 Thread Andrew Beekhof
You could probably do something with cibadmin, grep and sed. On Tue, Nov 29, 2011 at 1:04 AM, wrote: > Hi > > Sorry, but I forgot whether there is another way than "crm configure edit" to > modify the value of on-fail="" for all resources in the configuration. > > Thanks > Alain
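
A minimal sketch of the cibadmin-and-sed approach suggested in the reply above, not something taken from the thread itself. It assumes the on-fail attributes sit in the resources section of the CIB and that the new value is a plain word such as restart or fence. The helper name set_on_fail and the /tmp paths are hypothetical, and the cibadmin steps are left as comments because they should be checked against the installed Pacemaker version before running them on a live cluster.

```shell
#!/bin/sh
# set_on_fail is a hypothetical helper, not a Pacemaker tool.
# It reads CIB XML on stdin and rewrites every on-fail="..." value
# to the value given as $1, writing the result to stdout.
set_on_fail() {
    sed "s/on-fail=\"[^\"]*\"/on-fail=\"$1\"/g"
}

# Possible use on a cluster (assumption: review the diff before replacing):
#   cibadmin -Q -o resources > /tmp/resources.xml
#   set_on_fail restart < /tmp/resources.xml > /tmp/resources-new.xml
#   diff -u /tmp/resources.xml /tmp/resources-new.xml
#   cibadmin --replace -o resources --xml-file /tmp/resources-new.xml
```

The filter only touches attributes that already exist; operations without an on-fail attribute are passed through unchanged, so they keep the cluster default.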

Re: [Linux-HA] Antw: Re: Q: unmanaged MD-RAID & auto-recovery

2011-11-28 Thread Andrew Beekhof
On Tue, Nov 29, 2011 at 7:54 AM, Dimitri Maziuk wrote: > On 11/28/2011 02:37 PM, Andrew Beekhof wrote: >> On Mon, Nov 28, 2011 at 7:16 PM, Ulrich Windl >> wrote: > >>> And therefore you need to monitor the _unmanaged_ resource? Strange. >> >> Now is the
