Re: [Linux-HA] Monitor operations not running after maintenance-mode=false ??

2012-09-12 Thread Andrew Beekhof
On Wed, Sep 12, 2012 at 5:58 PM, Stefan Schloesser sschloes...@enomic.com wrote: crm configure property maintenance-mode=true crm configure property maintenance-mode=false No bug. This is what maintenance-mode is supposed to do. What are you trying to achieve? [] Hi Andrew, my

Re: [Linux-ha-dev] Slight bending of OCF specs: Re: Issues found in Apache resource agent

2012-09-05 Thread Andrew Beekhof
On 06/09/2012, at 12:30 AM, Lars Marowsky-Bree l...@suse.com wrote: On 2012-09-05T15:25:44, Dejan Muhamedagic de...@suse.de wrote: How about a new element. Something like primitive vm1 ocf:heartbeat:VirtualDomain require vm1 web-test dns-test How we map this into Pacemaker's

Re: [Linux-HA] Antw: Duplicate monitor operation on a multi state resource

2012-09-04 Thread Andrew Beekhof
On Wed, Aug 22, 2012 at 6:35 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2012-08-22T10:32:57, RaSca ra...@miamammausalinux.org wrote: Thank you Lars, In fact, this is what I've done and now everything is ok. But I want to understand one last thing: if the ID is calculated with the value of

Re: [Linux-ha-dev] fence_legacy patch

2012-09-03 Thread Andrew Beekhof
Thanks Piotr, I've applied your patch and it will be in 1.1.8 Sorry for the delay. On Thu, Aug 23, 2012 at 12:04 AM, Chmylkowski, Piotr piotr.chmylkow...@atos.net wrote: Dear HA-dev I have implemented vcenter fencing, but the following patch was required to get it working. The problem was:

Re: [Linux-HA] new 'recovery' ressource state ?

2012-08-30 Thread Andrew Beekhof
You can get the same behaviour by setting longer timeouts and having the RA not return until it decides one way or another that the resource is good or bad. The best way not to have pacemaker perform a premature failover, is to not tell us about failures until you're sure. On Wed, Aug 29, 2012 at
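One way to read the advice above is that the agent's monitor action absorbs transient failures itself and only reports once it has made up its mind. A minimal sketch in shell, assuming a hypothetical `monitor_with_retries` helper and a caller-supplied check command (neither is taken from a shipped resource agent). The return codes are the standard OCF values 0 (success) and 7 (not running).

```shell
# Hypothetical helper: run CHECK up to TRIES times, DELAY seconds apart,
# and only report "not running" once every attempt has failed.
# Usage: monitor_with_retries CHECK TRIES DELAY
monitor_with_retries() {
    check=$1
    tries=$2
    delay=$3
    i=0
    while [ "$i" -lt "$tries" ]; do
        # $check is intentionally unquoted so "cmd arg" is split into words
        if $check; then
            return 0    # OCF_SUCCESS
        fi
        i=$((i + 1))
        if [ "$i" -lt "$tries" ]; then
            sleep "$delay"
        fi
    done
    return 7            # OCF_NOT_RUNNING
}
```

For this to work the monitor operation's timeout in the cluster configuration would have to be longer than the worst case of TRIES attempts plus the delays between them, which is the "longer timeouts" half of the suggestion.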

Re: [Linux-HA] pacemaker 1.1 corosync and ocfs2

2012-08-30 Thread Andrew Beekhof
On Thu, Aug 30, 2012 at 3:49 AM, Heitor Lessa heitor.le...@hotmail.com wrote: Hi, Has anyone got success implementing Corosync 1.4 with pacemaker 1.1 and OCFS2 in other distribution apart of Fedora? You'd want to add cman between corosync and pacemaker for this. See the 1.1-plugin edition of

Re: [Linux-HA] Best way to know on which host a resource has failed and where it will be promoted

2012-08-30 Thread Andrew Beekhof
On Sun, Aug 26, 2012 at 7:08 PM, RaSca ra...@miamammausalinux.org wrote: Hi all, I want to interact with the new master election. I don't know if I must operate at a Resource Agent level or at cluster level, so I'm open to suggestions. Suppose I've got a multi state resource for which I

Re: [Linux-HA] IP Clone

2012-08-21 Thread Andrew Beekhof
On Tue, Aug 21, 2012 at 3:22 PM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote: On 8/20/2012 7:32 PM, Andrew Beekhof wrote: On Tue, Aug 21, 2012 at 8:49 AM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote: You are not allowed to run the IP address on two servers at once, full stop. Complain to Rob

Re: [Linux-HA] IP Clone

2012-08-20 Thread Andrew Beekhof
On Tue, Aug 21, 2012 at 8:49 AM, Dimitri Maziuk dmaz...@bmrb.wisc.edu wrote: On 08/20/2012 05:01 PM, Yount, William D wrote: I am trying to set up an Active/Active cluster. I have an Active/Passive cluster up and running. I don't remember seeing a clear explanation of when, where, and why

Re: [Linux-HA] IP Clone

2012-08-20 Thread Andrew Beekhof
On Fri, Aug 17, 2012 at 12:09 PM, Yount, William D yount.will...@menloworldwide.com wrote: I have two servers. I am using pacemaker/cman(corosync). I am trying to share an IP address between them. I would like the IP address to run on both servers at the same time. However, my testing has

Re: [Linux-ha-dev] apply_xml_diff: Digest mis-match

2012-08-16 Thread Andrew Beekhof
On Mon, Aug 13, 2012 at 11:39 PM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote: Hi! In pacemaker-1.1.6-1.29.1 (SLES11 SP2 x86_64) I see this for an idle cluster with just one stonith resource being running when doing some unrelated change: Aug 13 15:33:19 h3 cib: [31938]: info:

Re: [Linux-HA] Question about pacemaker + heartbeat + postgres in active/passive configuration

2012-08-15 Thread Andrew Beekhof
On Wed, Aug 15, 2012 at 10:25 PM, Renee Riffee riffe...@gmail.com wrote: Hello everyone, Apologies if this is not the correct group for this question, but I am seeking information on how to set up pacemaker with heartbeat and postgres in an active/passive streaming (pg 9.1) configuration.

Re: [Linux-HA] Question about pacemaker + heartbeat + postgres in active/passive configuration

2012-08-15 Thread Andrew Beekhof
it's time for me to move to Corosync. - Mike -Original Message- From: linux-ha-boun...@lists.linux-ha.org [mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Andrew Beekhof Sent: Wednesday, August 15, 2012 7:40 PM To: General Linux-HA mailing list Subject: Re: [Linux-HA

Re: [Linux-HA] heartbeat and n-to1 clusters

2012-08-07 Thread Andrew Beekhof
On Tue, Aug 7, 2012 at 1:42 AM, Andy Furtado awf...@yahoo.com wrote: Hello, Is it possible to setup an n-to-1 cluster configuration and have heartbeat manage a different VIP for each virtual pair. The n-to-1 configuration would have a single slave node, able to take over for any one of

Re: [Linux-HA] Heartbeat Error

2012-08-05 Thread Andrew Beekhof
On Fri, Aug 3, 2012 at 5:18 PM, Yount, William D yount.will...@menloworldwide.com wrote: I am using pacemaker and corosync. For some reason I keep getting this error in my messages log: ERROR: Cannot chdir to [/var/lib/heartbeat/cores/root]: No such file or directory Should I not worry

Re: [Linux-HA] Heartbeat Error [Solved]

2012-08-05 Thread Andrew Beekhof
More recent versions will create the leaf directory for you when pacemaker starts. On Fri, Aug 3, 2012 at 5:39 PM, Yount, William D yount.will...@menloworldwide.com wrote: I was able to fix the error by creating the directory manually. /var/lib/heartbeat/cores was already there, I just added

Re: [Linux-HA] Antw: How to configure ordered sets of unordered resources as described in Pacemaker doc ?

2012-08-01 Thread Andrew Beekhof
delivered with Pacemaker on RH does not support this, It was there but broken that means that there was another way to enter this type of configuration? Or perhaps that nobody has yet needed this type of set ordering? (Which would be strange ...) Alain From: Andrew Beekhof

Re: [Linux-HA] Antw: How to configure ordered sets of unordered resources as described in Pacemaker doc ?

2012-07-31 Thread Andrew Beekhof
information about the crm shell Thanks Alain From: Andrew Beekhof and...@beekhof.net To: General Linux-HA mailing list linux-ha@lists.linux-ha.org Date: 30/07/2012 10:23 Subject: Re: [Linux-HA] Antw: How to configure ordered sets of unordered resources as described in Pacemaker doc

Re: [Linux-HA] Antw: How to configure ordered sets of unordered resources as described in Pacemaker doc ?

2012-07-30 Thread Andrew Beekhof
On Mon, Jul 30, 2012 at 3:58 PM, alain.mou...@bull.net wrote: Hi Andrew sorry but I don't understand what you mean by the stand-alone version of the shell? The shell is now a separate project. Thanks Alain From: Andrew Beekhof and...@beekhof.net To: General Linux-HA mailing

Re: [Linux-HA] Antw: How to configure ordered sets of unordered resources as described in Pacemaker doc ?

2012-07-30 Thread Andrew Beekhof
github page. Thanks again Alain From: Andrew Beekhof and...@beekhof.net To: General Linux-HA mailing list linux-ha@lists.linux-ha.org Date: 30/07/2012 08:30 Subject: Re: [Linux-HA] Antw: How to configure ordered sets of unordered resources as described in Pacemaker doc ? Sent

Re: [Linux-HA] HA iSCSI storage problem

2012-07-30 Thread Andrew Beekhof
On Mon, Jul 30, 2012 at 5:32 PM, Bruno MACADRE bruno.maca...@univ-rouen.fr wrote: On 30/07/2012 04:29, Andrew Beekhof wrote: On Mon, Jul 23, 2012 at 11:56 PM, Bruno MACADRE bruno.maca...@univ-rouen.fr wrote: Hi, I'm working on a 2-node active/passive iSCSI storage cluster

Re: [Linux-HA] Pacemaker from source

2012-07-29 Thread Andrew Beekhof
Possibly your version of autotools was too old. On Thu, Jul 26, 2012 at 2:50 AM, Heitor Lessa heitor.le...@hotmail.com wrote: Worked ! I tried to get version via github and another tarball and did not work (latest code), so looking for another versions I found a source rpm and worked

Re: [Linux-HA] Heartbeat isn't switching to the 2nd node when Httpd is down!

2012-07-29 Thread Andrew Beekhof
On Wed, Jul 25, 2012 at 1:01 AM, Aboubakr Seddik Ouahabi ouaha...@gmail.com wrote: Hey there, I've created a thread somewhere, but I guess this is the right place to seek help for this, and here is my issue as stated there: Ok guys, that was very much appreciated and I thank you again. For

Re: [Linux-HA] HA iSCSI storage problem

2012-07-29 Thread Andrew Beekhof
On Mon, Jul 23, 2012 at 11:56 PM, Bruno MACADRE bruno.maca...@univ-rouen.fr wrote: Hi, I'm working on a 2-node active/passive iSCSI storage cluster. (I follow the guide from linbit) All works fine, except when active server goes offline, in this case the iSCSI

Re: [Linux-HA] Antw: How to configure ordered sets of unordered resources as described in Pacemaker doc ?

2012-07-29 Thread Andrew Beekhof
On Fri, Jul 27, 2012 at 10:02 PM, alain.mou...@bull.net wrote: I've found in the mailing-list messages the syntax I could have written with crm configure edit , something like : order order-g-FS inf: ( fs-A fs-B fs-C fs-D fs-E ) ( exportfs-fs-A exportfs-fs-B exportfs-fs-C exportfs-fs-D

Re: [Linux-HA] lsb agents on fedora17

2012-07-10 Thread Andrew Beekhof
On Mon, Jul 9, 2012 at 7:33 PM, ov...@qip.ru wrote: How is it possible to start/stop services on fedora17 using pacemaker? That is a different question to the subject ;-) Systemd unit files are not LSB compliant, so you cannot use lsb::somename In 1.1.8 (due in the next month or so) I have

Re: [Linux-HA] I need to edit my cib.xml manually

2012-06-25 Thread Andrew Beekhof
in place a pre-configured cluster.conf with all infos/resources inside and then start the CS) Sure you can. Alain From: Andrew Beekhof and...@beekhof.net To: General Linux-HA mailing list linux-ha@lists.linux-ha.org Date: 21/06/2012 03:56 Subject: Re: [Linux-HA] I need to edit my cib.xml

Re: [Linux-HA] I need to edit my cib.xml manually

2012-06-20 Thread Andrew Beekhof
On Wed, Jun 20, 2012 at 7:21 PM, alain.mou...@bull.net wrote: Indeed, it seems to work fine to remove all .sig, modify cib.xml and start Pacemaker again! That's really new for me, who looked, one year ago, for a way to configure Pacemaker from scratch, without starting it, and also

Re: [Linux-HA] I need to edit my cib.xml manually

2012-06-20 Thread Andrew Beekhof
was the parameter and why not modify it before the cluster was shut down or after it came back up? 5/ and then start Pacemaker again on all nodes and it seems to work fine. (but for now, I test with a two-node cluster only) Alain From: Andrew Beekhof and...@beekhof.net To: General

Re: [Linux-HA] Pacemaker/corosync == Pacemaker/cman (on RH 6.2)

2012-06-19 Thread Andrew Beekhof
not supporting. Thanks Alain From: Andrew Beekhof and...@beekhof.net To: General Linux-HA mailing list linux-ha@lists.linux-ha.org Date: 18/06/2012 23:38 Subject: Re: [Linux-HA] Pacemaker/corosync == Pacemaker/cman (on RH 6.2) Sent by: linux-ha-boun...@lists.linux-ha.org

Re: [Linux-HA] What's the meaning of ... Failed application of an update diff

2012-06-19 Thread Andrew Beekhof
On Tue, Jun 19, 2012 at 6:29 PM, Lars Marowsky-Bree l...@suse.com wrote: On 2012-06-19T08:38:11, alain.mou...@bull.net wrote: So that means that my modifications by crm configure edit , even if they are correct (I've re-checked them) , have potentially corrupt the Pacemaker configuration ?

Re: [Linux-HA] Pacemaker/corosync == Pacemaker/cman (on RH 6.2)

2012-06-19 Thread Andrew Beekhof
not supporting. Yes I know, but if it is always delivered on 6.3, that will be sufficient for me until I switch from stack option 2 to stack option 4 Alain From: Andrew Beekhof and...@beekhof.net To: General Linux-HA mailing list linux-ha@lists.linux-ha.org Date: 19/06/2012 11:34

Re: [Linux-HA] Pacemaker/corosync == Pacemaker/cman (on RH 6.2)

2012-06-18 Thread Andrew Beekhof
: Andrew Beekhof and...@beekhof.net To: General Linux-HA mailing list linux-ha@lists.linux-ha.org Date: 16/06/2012 12:25 Subject: Re: [Linux-HA] Pacemaker/corosync == Pacemaker/cman (on RH 6.2) Sent by: linux-ha-boun...@lists.linux-ha.org On Fri, Jun 15, 2012 at 10:06 PM

Re: [Linux-HA] What's the meaning of ... Failed application of an update diff

2012-06-18 Thread Andrew Beekhof
On Mon, Jun 18, 2012 at 11:38 PM, alain.mou...@bull.net wrote: Hi What's the meaning of such syslog messages : 1340026364 2012 Jun 18 15:32:44 xna1 daemon notice cib [11129]: notice: cib_process_diff: Diff 0.966.1 - 0.966.2 not applied to 0.966.1: Failed application of an update diff

Re: [Linux-HA] Problem with master/slave migration on fedora17

2012-06-18 Thread Andrew Beekhof
Not enough information, I'm afraid. We need more than descriptions of the events; can you run crm_report for the period covered by your test? On Mon, Jun 18, 2012 at 6:29 PM, ov...@qip.ru wrote: Environment fedora17+corosync-2.0.1-1.fc17.x86_64+pacemaker-1.1.7-2.fc17.x86_64 two node cluster:

Re: [Linux-HA] Pacemaker/corosync == Pacemaker/cman (on RH 6.2)

2012-06-16 Thread Andrew Beekhof
On Fri, Jun 15, 2012 at 10:06 PM, alain.mou...@bull.net wrote: Hi Andrew you reminded me in an old thread here that indeed cman was not involved in option 4 : corosync + cpg + quorumd + mcp whereas it is involved in option 3 : corosync + cpg + cman + mcp but it seems that corosync is

Re: [Linux-HA] Pacemaker/corosync == Pacemaker/cman (on RH 6.2)

2012-06-14 Thread Andrew Beekhof
as with corosync? I believe it's possible, I don't know the details though. except if we use bond IF? Alain From: Andrew Beekhof and...@beekhof.net To: General Linux-HA mailing list linux-ha@lists.linux-ha.org Date: 13/06/2012 03:13 Subject: Re: [Linux-HA] Pacemaker/corosync

Re: [Linux-HA] Active/Active Cluster

2012-06-14 Thread Andrew Beekhof
Um, you appear to have cman and corosync as cloned resources. That's really not a good idea. Have you seen the Clusters from Scratch document? That would be a good place to start. On Thu, Jun 14, 2012 at 8:16 AM, Yount, William D yount.will...@menloworldwide.com wrote: I have two servers,

Re: [Linux-HA] Resources of a group running on different hosts

2012-06-14 Thread Andrew Beekhof
On Mon, Jun 11, 2012 at 2:16 AM, Luca Meron ckslxpc...@hotmail.it wrote: Hi. I've created a 2 node active/passive cluster. The HA manages 17 resources, among IP and other services. But I've a problem when resource placement: after I add one of the latest resource, it is started on node2

Re: [Linux-HA] DRBD service

2012-06-14 Thread Andrew Beekhof
On Wed, Jun 13, 2012 at 9:10 AM, Yount, William D yount.will...@menloworldwide.com wrote: I am not sure which list to send this to; DRBD, Pacemaker, Corosync, etc. But I figured I would start here and let someone guide me to the correct group. I am trying to setup a DRBD Active/Active cluster

Re: [Linux-HA] Pacemaker/corosync == Pacemaker/cman (on RH 6.2)

2012-06-13 Thread Andrew Beekhof
point. My bad :) and crm_mon -1 ;-) http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/_bringing_the_cluster_online_with_cman.html But ok it starts now, I'll test this stack. Thanks a lot Andrew. Alain From: Andrew Beekhof and...@beekhof.net To: General

Re: [Linux-HA] Pacemaker/corosync == Pacemaker/cman (on RH 6.2)

2012-06-12 Thread Andrew Beekhof
On Wed, Jun 13, 2012 at 12:41 AM, alain.mou...@bull.net wrote: Hi I tried to make a Pacemaker/cman stack working following the instructions here : (on Red-Hat 6.2) So I stopped corosync and Pacemaker I remove the corosync.conf (and there were no /etc/corosync/service/pcmk file) I've

Re: [Linux-ha-dev] STONITH agent for SoftLayer API

2012-06-11 Thread Andrew Beekhof
On Fri, Jun 8, 2012 at 1:19 PM, Alan Robertson al...@unix.sh wrote: Red Hat invented their own API then disabled the working API in their version of the code. Of course, they don't have as many agents, and they're not as well tested Red Hat has had their own API for a very long time.

Re: [Linux-ha-dev] Pacemaker and conntrackd RA not obeying colocation constraint

2012-06-11 Thread Andrew Beekhof
On Thu, Jun 7, 2012 at 5:37 PM, aldo sarmiento sarmi...@gmail.com wrote: Hello, I'm having a problem getting conntrackd ms to work with a colocation constraint. I want to have conntrackd Master only on the node that has an IPaddr2 primitive running on it. Here are my specs: Ubuntu: 12.04

Re: [Linux-HA] bug in fence_virsh?

2012-06-11 Thread Andrew Beekhof
On Fri, Jun 8, 2012 at 5:40 PM, Léon Keijser keij...@stone-it.com wrote: On Thu, 2012-06-07 at 10:37 +1000, Andrew Beekhof wrote: Now according to the fence_virsh ra info, the param 'port' should indicate the name of the guest on the hypervisor. IIRC we try to work it out automatically

Re: [Linux-HA] Strange Pacemaker issue

2012-06-11 Thread Andrew Beekhof
Software versions? On Thu, Jun 7, 2012 at 4:53 AM, Yves Trudeau y.trud...@videotron.ca wrote: Hi Florian corosync-cfgtool -s is identical on all nodes? Yes, of course node ID are different and the id correspond to the IP of the local NIC. corosync-objctl | grep member produces 5 members

Re: [Linux-HA] Active/Active Cluster

2012-06-11 Thread Andrew Beekhof
On Sat, Jun 9, 2012 at 9:51 PM, emmanuel segura emi2f...@gmail.com wrote: Why you are using cman corosync together? I think you should use cman+pacemaker or corosync+pacemaker Right. cman uses corosync underneath, but you should only configure+start one of them. Probably cman in this case.

Re: [Linux-HA] Resource too active / vsftpd ubuntu 10.04

2012-06-11 Thread Andrew Beekhof
On Mon, Jun 11, 2012 at 1:52 AM, Luca Meron ckslxpc...@hotmail.it wrote: Hi. I'm getting the error Resource too active on several standard ubuntu 10.04 startup scripts, like nmbd, smbd, vsftpd and winbind. I didn't test with ocf-tester but it looks very strange to me that they're not ocf

Re: [Linux-HA] bug in fence_virsh?

2012-06-06 Thread Andrew Beekhof
On Thu, Jun 7, 2012 at 4:50 AM, Léon Keijser keij...@stone-it.com wrote: Hi, For a simple demonstration I've set up a 2-node cluster (both kvm virtuals) and configured stonith to interact with the kvm hypervisor. My config: [root@node2 ~]# crm configure show node node1.testnet.lan node

Re: [Linux-HA] Question about stacks .

2012-06-03 Thread Andrew Beekhof
On Fri, Jun 1, 2012 at 9:10 PM, alain.mou...@bull.net wrote: Hi I'm a little bit confused about stack choices : On this page we can see that there were 4 Corosync-based options : http://theclusterguy.clusterlabs.org/post/907043024/introducing-the-pacemaker-master-control-process-for So,

Re: [Linux-ha-dev] [rfc] SBD with Pacemaker/Quorum integration

2012-05-27 Thread Andrew Beekhof
On Sat, May 26, 2012 at 5:56 AM, Lars Marowsky-Bree l...@suse.com wrote: On 2012-05-25T21:44:25, Florian Haas flor...@hastexo.com wrote: If so, the master thread will not self-fence even if the majority of devices is currently unavailable. That's it, nothing more. Does that help? It

Re: [Linux-HA] corosync/pacemaker cluster failed

2012-05-24 Thread Andrew Beekhof
On Fri, May 25, 2012 at 10:39 AM, Tracy Reed tr...@ultraviolet.org wrote: On Fri, May 25, 2012 at 02:01:18AM +0200, Lars Ellenberg spake thusly: Something is broken with your IPaddr2 script. Relevant package would be resource-agents. I suggest you simply  wget

Re: [Linux-HA] Can /var/lib/pengine files be deleted at boot?

2012-05-15 Thread Andrew Beekhof
On Wed, May 16, 2012 at 3:17 AM, William Seligman selig...@nevis.columbia.edu wrote: I've had some problems with my Linux pacemaker cluster recently. I traced the problem to what I believe is incorrect state information that was saved in directory /var/lib/pengine. Nope. /var/lib/pengine is a

Re: [Linux-HA] We Rebooted a Healthy Standby Node and All the Services on the Primary Node Restarted?

2012-05-09 Thread Andrew Beekhof
On Tue, May 8, 2012 at 4:27 AM, Robinson, Eric eric.robin...@psmnv.com wrote: Hi guys, we rebooted a standby node of a healthy cluster and suddenly all the resources on the primary cluster restarted. What's up with that? Before rebooting the standby node, we did the normal stuff to verify

Re: [Linux-HA] Pacemaker monitor

2012-05-09 Thread Andrew Beekhof
On Thu, Apr 26, 2012 at 1:17 PM, dong he smiledon...@gmail.com wrote: Hi, recently I'm clustering the OpenSIPS with two Ubuntu computers. I did it step by step and used the tutorial: http://anders.com/cms/259/Linux.Tutorial/OpenSer/Heartbeat.v2.0 But unfortunately I still met so many

Re: [Linux-ha-dev] heartbeat gmain source priority inversion with rexmit and dead node detection

2012-04-29 Thread Andrew Beekhof
On Sat, Apr 28, 2012 at 12:11 AM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Thu, Apr 26, 2012 at 10:56:30AM +0900, renayama19661...@ybb.ne.jp wrote: Hi All, We gave test that assumed remote cluster environment. And we tested packet lost. You may be interested in this patch I have

Re: [Linux-HA] HA samba?

2012-04-26 Thread Andrew Beekhof
On Thu, Apr 26, 2012 at 8:38 AM, Serge Dubrouski serge...@gmail.com wrote: On Wed, Apr 25, 2012 at 4:28 PM, Seth Galitzer sg...@ksu.edu wrote: On 04/25/2012 05:12 PM, Dimitri Maziuk wrote: On 04/25/2012 03:53 PM, Seth Galitzer wrote: Can anybody point me to recent docs on how to go about

Re: [Linux-HA] IPaddr stop is broken

2012-04-16 Thread Andrew Beekhof
On Mon, Apr 16, 2012 at 10:19 PM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi Andrew, On Fri, Apr 13, 2012 at 03:50:21PM +1000, Andrew Beekhof wrote: Looks like someone forgot to strip off the trailing colon from the ifname +++ find_interface_generic 192.168.122.110 +++ ipaddr

Re: [Linux-HA] problem with pind

2012-04-15 Thread Andrew Beekhof
On Fri, Apr 13, 2012 at 8:51 PM, S, MOHAMED (MOHAMED)** CTR ** mohame...@alcatel-lucent.com wrote: Hi, The Pacemaker_Explained.pdf document says that setting of migration-threshold=2 and failure-timeout=60s would cause the resource to move to a new node after 2 failures, and allow it to
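The two meta attributes quoted in the question could be set on an existing resource with Pacemaker's `crm_resource` tool. The sketch below only prints the commands so they can be reviewed before being run by hand; `print_failover_policy_cmds` and the resource id `my_rsc` are hypothetical placeholders, and the values are the ones from the question.

```shell
# Hypothetical helper: print (do not execute) the crm_resource commands
# that set migration-threshold and failure-timeout as meta attributes.
# Usage: print_failover_policy_cmds RESOURCE THRESHOLD TIMEOUT
print_failover_policy_cmds() {
    rsc=$1          # resource id (placeholder in this example)
    threshold=$2    # failures on a node before the resource is moved away
    timeout=$3      # how long until recorded failures expire
    printf 'crm_resource --resource %s --meta --set-parameter migration-threshold --parameter-value %s\n' \
        "$rsc" "$threshold"
    printf 'crm_resource --resource %s --meta --set-parameter failure-timeout --parameter-value %s\n' \
        "$rsc" "$timeout"
}

print_failover_policy_cmds my_rsc 2 60s
```

Printing rather than executing keeps the example safe to run on a machine without a cluster.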

Re: [Linux-HA] pacemaker+drbd promotion delay

2012-04-12 Thread Andrew Beekhof
On Thu, Apr 12, 2012 at 5:26 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Wed, Apr 11, 2012 at 08:22:59AM +1000, Andrew Beekhof wrote: It looks like the drbd RA is calling crm_master during the monitor action. That wouldn't seem like a good idea as the value isn't counted until

Re: [Linux-HA] pacemaker+drbd promotion delay

2012-04-12 Thread Andrew Beekhof
On Fri, Apr 13, 2012 at 11:47 AM, Andrew Beekhof and...@beekhof.net wrote: On Thu, Apr 12, 2012 at 5:26 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Wed, Apr 11, 2012 at 08:22:59AM +1000, Andrew Beekhof wrote: It looks like the drbd RA is calling crm_master during the monitor action

[Linux-HA] IPaddr stop is broken

2012-04-12 Thread Andrew Beekhof
Looks like someone forgot to strip off the trailing colon from the ifname +++ find_interface_generic 192.168.122.110 +++ ipaddr=192.168.122.110 +++ read ifname linkstuff +++ ifconfig +++ : Read gave us ifname = eth0: +++ read inet addr junk +++ : Read gave us inet = inet addr = 192.168.122.103
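The trace shows `read` returning `eth0:` as the interface name. A minimal sketch of dropping that trailing colon with POSIX parameter expansion, assuming a hypothetical `strip_trailing_colon` helper; this illustrates the idea and is not the change that was made to the IPaddr agent.

```shell
# Hypothetical helper: remove a single trailing colon from an interface
# name as parsed from ifconfig output, leaving other names untouched.
strip_trailing_colon() {
    # ${1%:} deletes the shortest suffix matching ":" if there is one
    printf '%s\n' "${1%:}"
}

ifname=$(strip_trailing_colon "eth0:")
echo "$ifname"
```

Alias-style names such as `eth0:1` pass through unchanged, because the expansion only removes a colon that is the last character.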

Re: [Linux-HA] pacemaker+drbd promotion delay

2012-04-10 Thread Andrew Beekhof
the drbd RA always done this? On Sat, Mar 31, 2012 at 2:56 AM, William Seligman selig...@nevis.columbia.edu wrote: On 3/30/12 1:13 AM, Andrew Beekhof wrote: On Fri, Mar 30, 2012 at 2:57 AM, William Seligman selig...@nevis.columbia.edu wrote: On 3/29/12 3:19 AM, Andrew Beekhof wrote: On Wed, Mar

Re: [Linux-HA] pacemaker+drbd promotion delay

2012-03-29 Thread Andrew Beekhof
On Wed, Mar 28, 2012 at 9:12 AM, William Seligman selig...@nevis.columbia.edu wrote: The basics: Dual-primary cman+pacemaker+drbd cluster running on RHEL6.2; spec files and versions below. Problem: If I restart both nodes at the same time, or even just start pacemaker on both nodes at the

Re: [Linux-HA] ERROR: do_recover: Action A_RECOVER (0000000001000000) not supported

2012-03-29 Thread Andrew Beekhof
On Thu, Mar 29, 2012 at 8:31 PM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote: Hi! We had a problem when crmd crashed. Obviously, crmd after being restarted tried to recover, but it seems recovery is not implemented yet: Recovery is implemented, just not graceful recovery without a

Re: [Linux-HA] pacemaker+drbd promotion delay

2012-03-29 Thread Andrew Beekhof
On Fri, Mar 30, 2012 at 2:57 AM, William Seligman selig...@nevis.columbia.edu wrote: On 3/29/12 3:19 AM, Andrew Beekhof wrote: On Wed, Mar 28, 2012 at 9:12 AM, William Seligman selig...@nevis.columbia.edu wrote: The basics: Dual-primary cman+pacemaker+drbd cluster running on RHEL6.2; spec

Re: [Linux-HA] High Performance High Availability Guide: new community documentation project

2012-03-26 Thread Andrew Beekhof
On Mon, Mar 26, 2012 at 6:27 PM, Florian Haas flor...@hastexo.com wrote: On Mon, Mar 26, 2012 at 1:52 AM, Andrew Beekhof and...@beekhof.net wrote: On Fri, Mar 23, 2012 at 10:39 PM, Florian Haas flor...@hastexo.com wrote: Hi everyone, for those interested in contributing to a community

Re: [Linux-HA] High Performance High Availability Guide: new community documentation project

2012-03-25 Thread Andrew Beekhof
On Fri, Mar 23, 2012 at 10:39 PM, Florian Haas flor...@hastexo.com wrote: Hi everyone, for those interested in contributing to a community documentation project focusing on performance optimization in high availability clusters, please take a look at the following URLs:

Re: [Linux-HA] order transitivity (was Re: order troubles)

2012-03-23 Thread Andrew Beekhof
On Fri, Mar 23, 2012 at 10:30 AM, William Seligman selig...@nevis.columbia.edu wrote: On 3/22/12 10:06 AM, Florian Haas wrote: On Thu, Mar 22, 2012 at 10:34 AM, Lars Ellenberg lars.ellenb...@linbit.com wrote: order o_nfs_before_vz 0: cl_fs_nfs cl_vz order o_vz_before_ve992 0: cl_vz ve992 a

Re: [Linux-HA] Apparent problem in pacemaker ordering

2012-03-19 Thread Andrew Beekhof
On Tue, Mar 6, 2012 at 3:53 AM, Florian Haas flor...@hastexo.com wrote: On Sat, Mar 3, 2012 at 8:14 PM, Florian Haas flor...@hastexo.com wrote: In other words, interleave=true is actually the reasonable thing to set on all clone instances by default, and I believe the pengine actually does use

Re: [Linux-HA] fence_nut fencing agent - use NUT to fence via UPS

2012-03-01 Thread Andrew Beekhof
On Fri, Mar 2, 2012 at 9:37 AM, William Seligman selig...@nevis.columbia.edu wrote: After days spent debugging a fencing issue with my cluster, I know for certain that this fencing agent works, at least for me. I'd like to contribute it to the Linux HA community. In my cluster, the fencing

Re: [Linux-HA] cman+pacemaker+drbd fencing problem

2012-02-28 Thread Andrew Beekhof
On Wed, Feb 29, 2012 at 5:21 AM, William Seligman selig...@nevis.columbia.edu wrote: On 2/27/12 8:40 PM, Andrew Beekhof wrote: Oh, what does the fence_pcmk file look like? This is a standard part of the pacemaker-1.1.6 package. I know, I wrote it :-) I'm just curious exactly what it contains

Re: [Linux-HA] cman+pacemaker+drbd fencing problem

2012-02-27 Thread Andrew Beekhof
On Tue, Feb 28, 2012 at 11:49 AM, William Seligman selig...@nevis.columbia.edu wrote: I'm trying to set up an active/active HA cluster as explained in Clusters From Scratch (which I just re-read after my last problem). I'll give versions and config files below, but I'll start with what

Re: [Linux-HA] cman+pacemaker+drbd fencing problem

2012-02-27 Thread Andrew Beekhof
Oh, what does the fence_pcmk file look like? On Tue, Feb 28, 2012 at 12:40 PM, Andrew Beekhof and...@beekhof.net wrote: On Tue, Feb 28, 2012 at 11:49 AM, William Seligman selig...@nevis.columbia.edu wrote: I'm trying to set up an active/active HA cluster as explained in Clusters From

Re: [Linux-HA] Understanding the behavior of IPaddr2 clone

2012-02-24 Thread Andrew Beekhof
On Sat, Feb 25, 2012 at 6:39 AM, William Seligman selig...@nevis.columbia.edu wrote: At this point, it looks my notion of re-writing IPaddr2 won't work. I'm redesigning my cluster configuration so I don't require cloned/highly-available IP addresses. Is this a bug? Is there a bugzilla or

Re: [Linux-HA] Writing a stonith-ng fencing agent in perl

2012-02-23 Thread Andrew Beekhof
On Thu, Feb 23, 2012 at 11:08 AM, William Seligman selig...@nevis.columbia.edu wrote: On 2/22/12 6:20 PM, Andrew Beekhof wrote: On Thu, Feb 23, 2012 at 8:21 AM, William Seligman selig...@nevis.columbia.edu wrote: About 1.5 years ago, I wrote a fencing agent for Pacemaker 1.0.x; it used NUT

Re: [Linux-HA] Writing a stonith-ng fencing agent in perl

2012-02-23 Thread Andrew Beekhof
On Fri, Feb 24, 2012 at 9:56 AM, William Seligman selig...@nevis.columbia.edu wrote: The real reason the perl-scripted fencing agents don't give the correct response to stonith-admin is that they're looking for a action=XXX parameter from stdin, when the actual parameter being passed is

Re: [Linux-HA] Writing a stonith-ng fencing agent in perl

2012-02-22 Thread Andrew Beekhof
On Thu, Feb 23, 2012 at 8:21 AM, William Seligman selig...@nevis.columbia.edu wrote: About 1.5 years ago, I wrote a fencing agent for Pacemaker 1.0.x; it used NUT to shut down power on a UPS: http://www.mail-archive.com/pacemaker@oss.clusterlabs.org/msg05942.html I'm building a new HA

Re: [Linux-ha-dev] [PATCH] Medium: Use the resource timeout as an override to the default dbus timeout for upstart RA

2012-02-20 Thread Andrew Beekhof
On Sat, Feb 18, 2012 at 12:00 AM, Ante Karamatic iv...@ubuntu.com wrote: On 17.02.2012 11:20, Andrew Beekhof wrote: Tangential question... but does upstart also implement the service binary? As in service pacemaker start ? It does, but the exit status is always '0', which makes 'service

Re: [Linux-ha-dev] [PATCH] Medium: Use the resource timeout as an override to the default dbus timeout for upstart RA

2012-02-17 Thread Andrew Beekhof
Tangential question... but does upstart also implement the service binary? As in service pacemaker start ? On Fri, Feb 17, 2012 at 6:52 PM, Ante Karamatic iv...@ubuntu.com wrote: # HG changeset patch # User Ante Karamatić ante.karama...@canonical.com # Date 1329463546 -3600 # Node ID

Re: [Linux-HA] MMM conflict with Pacemaker

2012-02-17 Thread Andrew Beekhof
On Fri, Feb 17, 2012 at 4:00 AM, Mark Grennan m...@grennan.com wrote: Hi Marcus, One Issue I can think of is, Pacemaker wants to bind the floating IP as eth#:#, while MMM wants to use a different method that can only be seen with the IP command.   I think they are fighting over who owns the

Re: [Linux-HA] pacemaker/corosync - cl_status . REASON: hb_api_signon: Can't initiate connection to heartbeat

2012-02-16 Thread Andrew Beekhof
, Thomas. -Original Message- From: linux-ha-boun...@lists.linux-ha.org [mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Andrew Beekhof Sent: Wednesday, 15 February 2012 11:13 To: General Linux-HA mailing list Subject: Re: [Linux-HA] pacemaker/corosync - cl_status

Re: [Linux-HA] Understanding the behavior of IPaddr2 clone

2012-02-16 Thread Andrew Beekhof
On Fri, Feb 17, 2012 at 5:05 AM, Dejan Muhamedagic deja...@fastmail.fm wrote: Hi, On Wed, Feb 15, 2012 at 04:24:15PM -0500, William Seligman wrote: On 2/10/12 4:53 PM, William Seligman wrote: I'm trying to set up an Active/Active cluster (yes, I hear the sounds of kittens dying).

Re: [Linux-HA] pacemaker/corosync - cl_status . REASON: hb_api_signon: Can't initiate connection to heartbeat

2012-02-15 Thread Andrew Beekhof
On Wed, Feb 15, 2012 at 5:50 PM, Florian Haas flor...@hastexo.com wrote: On 02/14/12 03:09, Andrew Beekhof wrote: On Tue, Feb 14, 2012 at 7:26 AM, Thomas Baumann t...@tiri.li wrote: Hello list, In my current pacemaker/corosync installation in a 2 node cluster I get following error

Re: [Linux-HA] pacemaker/corosync - cl_status . REASON: hb_api_signon: Can't initiate connection to heartbeat

2012-02-13 Thread Andrew Beekhof
On Tue, Feb 14, 2012 at 7:26 AM, Thomas Baumann t...@tiri.li wrote: Hello list, In my current pacemaker/corosync installation in a 2 node cluster I get following error: # cl_status listnodes This is a heartbeat command, you're running corosync Try crm_node -p cl_status[3681]:

Re: [Linux-HA] FW: How DC is selected?

2012-02-05 Thread Andrew Beekhof
The location of the DC is an internal detail which you shouldn't care about. Why do you want to be able to predetermine its location? On Sun, Feb 5, 2012 at 1:53 PM, Mayank mayank.mittal.1...@hotmail.com wrote: Hello all, I'm using pacemaker to manage some of our resources, including Virtual

Re: [Linux-HA] Status about ocfs2.pcmk ?

2012-02-05 Thread Andrew Beekhof
On Fri, Feb 3, 2012 at 7:29 PM, alain.mou...@bull.net wrote: Hi Andreas , thanks for your response, but two questions : 1/ why going with GFS2 ? because you know that ocfs2+pacemaker still does not    work fine on rhel ? or ... ? I wouldn't hold my breath waiting for OCFS2 to be supported

Re: [Linux-HA] RA timeouts / Remote node did not respond

2012-01-30 Thread Andrew Beekhof
On Mon, Jan 30, 2012 at 9:28 PM, Sascha Reimann sascha.reim...@hostway.de wrote: Hi Dejan, thanks for the hints! It's indeed a bigger cluster with currently 9 nodes and approximately 50 resources, but we've planned to build an even bigger one, which will probably not be possible due to the

Re: [Linux-HA] crmsh property management regression

2012-01-16 Thread Andrew Beekhof
On Tue, Jan 17, 2012 at 4:56 AM, Dejan Muhamedagic deja...@fastmail.fm wrote: On Mon, Jan 16, 2012 at 06:47:54PM +0300, Vladislav Bogdanov wrote: Hi Dejan, thank you very much for a good pointer, you saved me much time. 16.01.2012 16:20, Dejan Muhamedagic wrote: Hi Vladislav, On Mon,

Re: [Linux-HA] The active trap of the SNMP is delayed.

2012-01-10 Thread Andrew Beekhof
...@suse.com wrote: Hi Andrew, On 11/30/11 19:01, Gao,Yan wrote: Hi Andrew, On 11/28/11 07:53, Andrew Beekhof wrote: On Thu, Nov 24, 2011 at 7:50 PM, Gao,Yan y...@suse.com wrote: Hi Hideo, On 11/24/11 15:48, renayama19661...@ybb.ne.jp wrote: Hi Yan, About this matter, were you

Re: [Linux-HA] Light Weight Quorum Arbitration

2012-01-04 Thread Andrew Beekhof
Hi Tanja, Is $(DTDROOT)/make/common.mk available somewhere so we can try building these? On Thu, Jan 5, 2012 at 2:31 AM, Tanja Roth tar...@suse.de wrote: Hi, On 2011-12-06 10:33 Florian Haas flor...@hastexo.com wrote: On Tue, Dec 6, 2011 at 9:50 AM, Lars Marowsky-Bree l...@suse.com wrote: On

Re: [Linux-HA] Question or problem around migration

2011-12-20 Thread Andrew Beekhof
On Wed, Dec 21, 2011 at 12:50 AM, Dan Frincu df.clus...@gmail.com wrote: Hi, On Tue, Dec 20, 2011 at 3:35 PM, alain.mou...@bull.net wrote: Oops, sorry, the behavior is not the same; you were right: with cluster-recheck-interval=90 crm resource migrate group1 node2 P300S migration is quite

Re: [Linux-HA] Antw: Re: Q on http://clusterlabs.org/wiki/FAQ#Resource_is_Too_Active

2011-12-18 Thread Andrew Beekhof
On Fri, Dec 16, 2011 at 11:34 PM, Ulrich Windl ulrich.wi...@rz.uni-regensburg.de wrote: Dominik Klein dominik.kl...@googlemail.com wrote on 16.12.2011 at 12:34 in message 4eeb2cdb.6020...@googlemail.com: On 12/15/2011 11:19 AM, Ulrich Windl wrote: Hi! I have a problem with some

Re: [Linux-ha-dev] [Patch] OCF_RESKEY_CRM_meta_timeout not matching monitor timeout meta-data.

2011-12-15 Thread Andrew Beekhof
On Thu, Dec 15, 2011 at 8:45 PM, renayama19661...@ybb.ne.jp wrote: Hi Dejan, Thank you for comment. It looks like a wrong place for a fix. Shouldn't crmd send all environment? It is only by chance that we have the timeout value available in this function. In the case of stop, crmd does

Re: [Linux-ha-dev] [Patch] OCF_RESKEY_CRM_meta_timeout not matching monitor timeout meta-data.

2011-12-15 Thread Andrew Beekhof
which resulted in:     https://github.com/ClusterLabs/pacemaker/commit/fcfe6fe522138343e4138248829926700fac213e All right. Will you apply this correction to 1.0 of Pacemaker? Sure. We'll pick it up for .13 Best Regards, Hideo Yamauchi. --- On Fri, 2011/12/16, Andrew Beekhof

Re: [Linux-HA] resource unmanaged/failed

2011-12-11 Thread Andrew Beekhof
On Fri, Dec 9, 2011 at 7:46 PM, Aleksey V. Kashin aleksey.kas...@gmail.com wrote: How much do they have now? They have 12G RAM. That seems respectable. How much is in use by the radius servers?                   total       used       free     shared    buffers     cached Mem:        

Re: [Linux-HA] About user and passwd encoding in cib

2011-12-08 Thread Andrew Beekhof
Fencing agents in RHEL already support the password-script parameter (which could conceivably even query ldap or a database). What is your use-case? On Thu, Dec 8, 2011 at 11:41 PM, alain.mou...@bull.net wrote: Hi Dejan, ok but I remember that on RHEL, Red-Hat removes several things in

Re: [Linux-HA] Pacemaker : Pb on stop on a resource while the monitoring is performed

2011-12-08 Thread Andrew Beekhof
On Tue, Nov 29, 2011 at 12:54 AM, alain.mou...@bull.net wrote: Hi I always have this problem. Just a little question : when this occurs, meaning a monitoring happening whereas there is just a crm command request on the resource i.e. migration, why not just return SUCCESS so that the next

Re: [Linux-HA] resource unmanaged/failed

2011-12-08 Thread Andrew Beekhof
On Wed, Dec 7, 2011 at 9:56 PM, Aleksey V. Kashin aleksey.kas...@gmail.com wrote: I can't increase the RAM on these servers. How can I keep the resource from becoming unmanaged/failed? How much do they have now? How much is in use by the radius servers?

Re: [Linux-ha-dev] In RHEL5 and RHEL6 about different HA_RSCTMP

2011-12-07 Thread Andrew Beekhof
On Tue, Dec 6, 2011 at 8:00 PM, nozawat noza...@gmail.com wrote: Hi, maintenance mode in the heartbeat stack currently does not work in RHEL6 because of this difference. The reason is that /var/run/heartbeat/rsctmp is deleted when Heartbeat initializes. Right, but the location and
