Re: [Linux-ha-dev] Re: [Linux-HA] eDirectory RA contribution

2007-04-19 Thread yan
Quoting Lars Marowsky-Bree [EMAIL PROTECTED]: On 2007-04-18T16:22:53, [EMAIL PROTECTED] wrote: Attached. Was meant to send it first time round. doh. Patch is against last version you sent me. For Alan's benefit, new verbatim version attached (0.11) Thanks! I merged it and pushed it out to

Re: [Linux-ha-dev] transition graphs during fail-over process

2007-04-19 Thread Andrew Beekhof
On 4/19/07, Junko IKEDA [EMAIL PROTECTED] wrote: Hi, This is not a serious problem, but I just noticed it, so please let me know whether this is common behavior for Heartbeat, if you know anything about it. There are two nodes, a virtual IP (IPaddr) is running on one of them.

Re: [Linux-ha-dev] start-delay parameter for monitor operation and Split-Brain

2007-04-19 Thread Andrew Beekhof
On 4/18/07, 池田淳子 [EMAIL PROTECTED] wrote: Hi Andrew, I'm sorry to ask a lot of questions at a time... Let's just put it this way. I am just trying to replicate a temporary blackout of the interconnect LAN. When some nodes resolve their Split-Brain, (1) If the LAN recovers

Re: [Linux-ha-dev] transition graphs during fail-over process

2007-04-19 Thread Andrew Beekhof
On 4/19/07, Junko IKEDA [EMAIL PROTECTED] wrote: -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Andrew Beekhof Sent: Thursday, April 19, 2007 4:29 PM To: High-Availability Linux Development List Subject: Re: [Linux-ha-dev] transition graphs during

[Linux-ha-dev] [patch 3/5] devname in scan_if() is too short

2007-04-19 Thread Horms
devname is passed to scanf which will fill in a string of up to 20 bytes + trailing '\0'. So make devname 21 bytes long accordingly. Index: lha-STABLE_1_2-ipv6addr/heartbeat/resource.d/IPv6addr.c === ---
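The sizing rule in this patch description can be sketched in isolation. The following is a minimal illustration, not the code from IPv6addr.c: the function name `parse_devname` and the constant `DEVNAME_MAX` are hypothetical, and the example uses `sscanf` on a string where the original presumably scans a file. The point it shows is that a `%20s` conversion may store up to 20 characters plus a terminating `'\0'`, so the destination needs 21 bytes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical constant: the field width used in the "%20s" conversion. */
#define DEVNAME_MAX 20

/*
 * Hypothetical helper: copy the first whitespace-delimited token of `line`
 * into `out`. The "%20s" conversion stores at most 20 characters and then
 * appends '\0', so `out` must hold DEVNAME_MAX + 1 bytes.
 * Returns 1 if a token was read, 0 otherwise.
 */
static int parse_devname(const char *line, char out[DEVNAME_MAX + 1])
{
    assert(line != NULL && out != NULL);
    return sscanf(line, "%20s", out) == 1;
}
```

A caller would declare the buffer as `char devname[DEVNAME_MAX + 1];`. Tying the array size to the same constant as the field width makes the off-by-one harder to reintroduce, although the width inside the format string still has to be kept in sync by hand.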

[Linux-ha-dev] [patch 2/5] overrun in find_if() for 128bit prefixes

2007-04-19 Thread Horms
while reading over the IPv6addr code I noticed that there is an overrun in find_if() in the case where the prefix is 128. In this case, mask.s6_addr[16] will be accessed twice, but that array only has 16 elements. The patch below takes the simple approach of just treating 128 as a corner case and
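The kind of overrun described here can be illustrated with a small sketch. This is not the find_if() code or the patch itself, which are not shown in the snippet; `prefix_to_mask` is a hypothetical helper written only to show why a prefix length of 128 needs care. A mask builder that always writes a "partial" byte at index `plen / 8` would touch `s6_addr[16]` when `plen` is 128, one past the end of the 16-byte array, so the sketch handles 128 as a corner case before that write.

```c
#include <assert.h>
#include <string.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: fill `mask` with the netmask for an IPv6 prefix
 * length between 0 and 128.
 */
static void prefix_to_mask(unsigned int plen, struct in6_addr *mask)
{
    unsigned int full; /* number of bytes that are entirely ones */
    unsigned int rem;  /* number of leading one bits in the next byte */

    assert(mask != NULL && plen <= 128);
    memset(mask, 0, sizeof(*mask));

    /* Corner case: all 16 bytes are ones and there is no partial byte.
     * Without this early return the write below would hit s6_addr[16]. */
    if (plen == 128) {
        memset(mask->s6_addr, 0xff, 16);
        return;
    }

    full = plen / 8; /* at most 15 here, so s6_addr[full] is in bounds */
    rem = plen % 8;
    memset(mask->s6_addr, 0xff, full);
    mask->s6_addr[full] = (unsigned char)(0xffu << (8 - rem));
}
```

An alternative design is to guard the partial-byte write with `if (rem != 0)`, which also avoids the out-of-bounds access; the early return is used here only because it mirrors the "treat 128 as a corner case" approach the message describes.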

[Linux-ha-dev] macros vs. functions

2007-04-19 Thread Bernd Schubert
Hi, while looking into the source due to our recent problems, I see there are several macros, which could be replaced by static (inline) functions, e.g. in GSource.c. Is there a reason to use macros? Do you mind if I convert this into functions? Thanks, Bernd -- Bernd Schubert Q-Leap

Re: [Linux-ha-dev] macros vs. functions

2007-04-19 Thread David Lee
On Thu, 19 Apr 2007, Bernd Schubert wrote: while looking into the source due to our recent problems, I see there are several macros, which could be replaced by static (inline) functions, e.g. in GSource.c. Is there a reason to use macros? Do you mind if I convert this into functions? My

Re: [Linux-ha-dev] macros vs. functions

2007-04-19 Thread Andrew Beekhof
I don't see any reason _not_ to make these particular macros functions... they're only used in the c-file that defines them and not being used to break out of loops or anything. On 4/19/07, Bernd Schubert [EMAIL PROTECTED] wrote: On Thursday 19 April 2007 14:10:45 David Lee wrote: On Thu, 19
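For readers following the macro-versus-function discussion, here is a generic sketch of one practical difference. None of these names come from GSource.c; `SQUARE_MACRO`, `square_fn`, `next_value` and `check_single_evaluation` are hypothetical and exist only for illustration. A function-like macro pastes its argument text wherever the parameter appears, so an argument with side effects can run more than once, while a `static inline` function evaluates its argument exactly once and has a checked parameter type.

```c
#include <assert.h>

/* Macro version: `x` is substituted textually and appears twice. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Function version: the argument is evaluated once, as an int. */
static inline int square_fn(int x)
{
    return x * x;
}

/* Counter with a side effect, used to observe how often it is called. */
static int calls = 0;

static int next_value(void)
{
    return ++calls;
}

/* Demonstrates the difference in the number of evaluations. */
static void check_single_evaluation(void)
{
    int r;

    calls = 0;
    r = SQUARE_MACRO(next_value()); /* expands to two calls: 1 * 2 */
    assert(r == 2 && calls == 2);

    calls = 0;
    r = square_fn(next_value());    /* one call: 1 * 1 */
    assert(r == 1 && calls == 1);
}
```

As the thread notes, a conversion like this is only straightforward when the macro is private to one C file and does not rely on macro-only behaviour such as breaking out of the caller's loop.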

Re: [Linux-ha-dev] macros vs. functions

2007-04-19 Thread Simon Horman
On Thu, Apr 19, 2007 at 02:37:47PM +0200, Andrew Beekhof wrote: i dont see any reason _not_ to make these particular macros functions... they're only used in the c-file that defines them and not being used to break out of loops or anything. I'll throw my 2c worth, which is that I also think

Re: [Linux-HA] set of apache servers + a service IP

2007-04-19 Thread Andrew Beekhof
On 4/18/07, Jose Jerez [EMAIL PROTECTED] wrote: I have been using heartbeat v2 for some time now and a happy customer I am :-) but I need your help for a configuration a little bit more complex. The system is SLES-10 and heartbeat 2.0.7 We have a group of apache servers each one of them in a

Re: [Linux-HA] status check

2007-04-19 Thread Andrew Beekhof
On 4/18/07, Lars Marowsky-Bree [EMAIL PROTECTED] wrote: On 2007-04-17T19:40:13, [EMAIL PROTECTED] wrote: Easiest way is to model after an existing resource agent, Xen for example. I've found the Dummy one a good start in the past. Simple, and shows the basic required components. Yeah, but

[Linux-HA] BadThingsHappen with v2.0.5.

2007-04-19 Thread Peter Kruse
Hello, thanks for reading this. As this is with the ancient v2.0.5, please tell me that this problem cannot happen with a recent version of heartbeat. Problem description: yesterday in one of our 2-node HA clusters a successful takeover happened, where the failed node was reset, so far so good.

Re: [Linux-HA] Distinguish probe and monitor

2007-04-19 Thread Keisuke MORI
Alan Robertson [EMAIL PROTECTED] writes: Andrew Beekhof wrote: On 4/17/07, Keisuke MORI [EMAIL PROTECTED] wrote: Hi, The official document on the web mentions about how to distinguish between probe and monitor that you can tell by referring to the value of OCF_RESKEY_CRM_meta_interval as

Re: [Linux-HA] IPv6, service fail on start behaviour

2007-04-19 Thread Benjamin Watine
No ideas about my questions? Benjamin Watine wrote: Hi all, I have two questions about Heartbeat v2 configuration: 1. IPv6addr: I've tried to configure a virtual IPv6 address for a resource group. Because I didn't find documentation about this script, I did it like IPaddr, but it doesn't

Re: [Linux-HA] set of apache servers + a service IP

2007-04-19 Thread Jose Jerez
Thanks Andrew I'll give it a try, or maybe wait for that service pack and ask another question in the list (due soon) ;-) On 4/19/07, Andrew Beekhof [EMAIL PROTECTED] wrote: On 4/18/07, Jose Jerez [EMAIL PROTECTED] wrote: I have been using heartbeat v2 for some time now and a happy customer I

Re: [Linux-HA] status check

2007-04-19 Thread Andrew Beekhof
On 4/19/07, Dejan Muhamedagic [EMAIL PROTECTED] wrote: On Thu, Apr 19, 2007 at 10:07:09AM +0200, Andrew Beekhof wrote: On 4/18/07, Lars Marowsky-Bree [EMAIL PROTECTED] wrote: On 2007-04-17T19:40:13, [EMAIL PROTECTED] wrote: Easiest way is to model after an existing resource agent, Xen for

Re: [Linux-HA] BadThingsHappen with v2.0.5.

2007-04-19 Thread Andrew Beekhof
On 4/19/07, Peter Kruse [EMAIL PROTECTED] wrote: Hello, thanks for reading this. As this is with the ancient v2.0.5, please tell me that this problem cannot happen with a recent version of heartbeat. Problem description: yesterday in one of our 2-node HA clusters a successful takeover happened, where

Re: [Linux-HA] Score, Resource stickiness Problems

2007-04-19 Thread Andrew Beekhof
On 4/18/07, Serge Dewailly [EMAIL PROTECTED] wrote: Hi all, I think I'm doing something wrong, but after much searching I can't see where I'm going wrong... I'm working on a two-node setup with drbd + filesystem + xen virtual machines. I made a group for each xen resource: group1 = drbd0 +

Re: [Linux-HA] BadThingsHappen with v2.0.5.

2007-04-19 Thread Peter Kruse
Hi Andrew! Andrew Beekhof wrote: beosrv-c-2 is the failed node right? it was beosrv-c-1 that failed, beosrv-c-2 took over. do you have logs from there too? attached (messages about Gmain_timeout removed, there were too many of them) The problem now is that cibadmin -m reports: CIB on

Re: [Linux-HA] BadThingsHappen with v2.0.5.

2007-04-19 Thread Andrew Beekhof
On 4/19/07, Peter Kruse [EMAIL PROTECTED] wrote: Hi Andrew! Andrew Beekhof wrote: beosrv-c-2 is the failed node right? it was beosrv-c-1 that failed, beosrv-c-2 took over. then i'm afraid your use of the dont fence nodes on startup option has come back to haunt you beosrv-c-1 came up but

Re: [Linux-HA] IPv6, service fail on start behaviour

2007-04-19 Thread Andrew Beekhof
On 4/18/07, Benjamin Watine [EMAIL PROTECTED] wrote: Hi all, I have two questions about Heartbeat v2 configuration: 1. IPv6addr: I've tried to configure a virtual IPv6 address for a resource group. Because I didn't find documentation about this script, I did it like IPaddr, but it doesn't seem

Re: [Linux-HA] BadThingsHappen with v2.0.5.

2007-04-19 Thread Peter Kruse
Andrew Beekhof wrote: then i'm afraid your use of the dont fence nodes on startup option has come back to haunt you beosrv-c-1 came up but was not able to find beosrv-c-2 (even though it _was_ running) and because of that option beosrv-c-1 just pretended beosrv-c-2 wasn't running and happily

Re: [Linux-HA] IPv6, service fail on start behaviour

2007-04-19 Thread Benjamin Watine
Andrew Beekhof wrote: On 4/18/07, Benjamin Watine [EMAIL PROTECTED] wrote: Hi all, I have two questions about Heartbeat v2 configuration: 1. IPv6addr: I've tried to configure a virtual IPv6 address for a resource group. Because I didn't find documentation about this script, I did it like

Re: [Linux-HA] Restarting a resource that failed to start

2007-04-19 Thread Andrew Beekhof
On 4/13/07, Piotr Kaczmarzyk [EMAIL PROTECTED] wrote: Hi, I'm using version 2.0.8 and I tried to provide a highly-available squid service. I wrote my own OCF script which was tested in two versions: ver 1. 'Start' function started squid, waited a few seconds, then tried to connect to port

Re: [Linux-HA] Cannot create group containing drbd using HB GUI

2007-04-19 Thread Martin Fick
--- Doug Knight [EMAIL PROTECTED] wrote: I tried setting up colocation constraints similar to those shown in the example referenced in the URL above, and it complained about the identical ids: ... I'm going to change the ids to be unique and try again, but wanted to point this out since it

Re: [Linux-HA] Cannot create group containing drbd using HB GUI

2007-04-19 Thread Doug Knight
I made the ID change indicated below (for the colocation constraints), and everything configured fine using cibadmin. Now, I started JUST the drbd master/slave resource, with the rsc_location rule setting the expression uname to one of the two nodes in the cluster. Both drbd processes come up and

Re: [Linux-HA] R2 Two-node apache cluster with STONITH

2007-04-19 Thread Dejan Muhamedagic
On Tue, Apr 17, 2007 at 03:55:07PM -0400, Bjorn Oglefjorn wrote: Alan, what is the list operation? The node names are always FQDNs and always match. Do they? From your CIB: primitive id=test-1_DRAC class=stonith type=external/drac4 provider=heartbeat operations op

Re: [Linux-HA] Distinguish probe and monitor

2007-04-19 Thread Peter Kruse
Hello, thanks for this discussion. Andrew Beekhof wrote: On 4/19/07, Peter Kruse [EMAIL PROTECTED] wrote: the PE makes zero distinction between them and since it's the one doing the asking i believe that it is its meaning that counts. yes, think so, too. both ask the same question: Is

Re: [Linux-HA] Cannot create group containing drbd using HB GUI

2007-04-19 Thread Martin Fick
Hi Doug, I personally could not get the DRBD OCF to work, I am using drbd .7x, what about you? I never tried a master/slave setup though. I created my own drbd OCF, it is on my site along with the CIB scripts. http://www.theficks.name/bin/lib/ocf/drbd You can even use the drbd CIBS as a

Re: [Linux-HA] Distinguish probe and monitor

2007-04-19 Thread Alan Robertson
Peter Kruse wrote: Hello, thanks for this discussion. Andrew Beekhof wrote: On 4/19/07, Peter Kruse [EMAIL PROTECTED] wrote: the PE makes zero distinction between them and since it's the one doing the asking i believe that it is its meaning that counts. yes, think so, too. both

Re: [Linux-HA] status check

2007-04-19 Thread Alan Robertson
Andrew Beekhof wrote: On 4/18/07, Lars Marowsky-Bree [EMAIL PROTECTED] wrote: On 2007-04-17T19:40:13, [EMAIL PROTECTED] wrote: Easiest way is to model after an existing resource agent, Xen for example. I've found the Dummy one a good start in the past. Simple, and shows the basic

Re: [Linux-HA] BadThingsHappen with v2.0.5.

2007-04-19 Thread Alan Robertson
Peter Kruse wrote: Andrew Beekhof wrote: then i'm afraid your use of the dont fence nodes on startup option has come back to haunt you beosrv-c-1 came up but was not able to find beosrv-c-2 (even though it _was_ running) and because of that option beosrv-c-1 just pretended beosrv-c-2 wasn't

Re: [Linux-HA] standalone pingd.sh

2007-04-19 Thread Carson Gaspar
Xinwei Hu wrote: The known issue is that I don't know how to daemonize in bash, so the pingd RA needs a little tweak also. You can't daemonize in bash, unless your OS comes with some executable that daemonizes arbitrary programs (I think some flavours of Linux do). -- Carson

Re: [Linux-HA] resource start not respecting location constraint

2007-04-19 Thread Alan Robertson
Yan Fitterer wrote: In the attached pe-warn, why is resource R_audit being started on idm01 when there is an INFINITY constraint with uname eq idm04? BTW - idm04 is in standby at the moment. That should hardly matter. I expect the resource to be cannot run anywhere. I really hope it's not

Re: [Linux-HA] transition graphs during fail-over process

2007-04-19 Thread Alan Robertson
Junko IKEDA wrote: Hi, This is not a serious problem, but I just noticed it, so please let me know whether this is common behavior for Heartbeat, if you know anything about it. There are two nodes, a virtual IP (IPaddr) is running on one of them. If the IPaddr is taken

Re: [Linux-HA] standalone pingd.sh

2007-04-19 Thread Carson Gaspar
Alan Robertson wrote: Carson Gaspar wrote: Xinwei Hu wrote: The known issue is that I don't know how to daemonize in bash, so the pingd RA needs a little tweak also. You can't daemonize in bash, unless your OS comes with some executable that daemonizes arbitrary programs (I think some