Re: [Linux-HA] How to add resources into crm through crm admin tools?

2007-05-03 Thread Alan Robertson
Tao Yu wrote: > Thanks for the information! > > By doing that, I guess the resource will be added as the last one. Is that > correct? > Is there any control on the order? You can do anything you want, it's just that getting picky means becoming more clever. You can replace subtrees in the XML tr

Re: [Linux-HA] Xen-HA on SLES x86_64

2007-05-03 Thread Alan Robertson
Rene Purcell wrote: > yeah ok so as I can see in the src code of the ocf xen module.. it does an xm > list and checks if the vm name contained in the xenfile is running.. > > so even if it's 2 different vm running on each node if their xen name are > both vm01 am I wrong to think that the resource a

Re: [Linux-HA] Stonith in 2 node setup

2007-05-03 Thread Alan Robertson
Benjamin Lawetz wrote: > Hi, > > I've been running some tests on my heartbeat setup with STONITH. > When I go live, there will be a serial connection, a crossover ethernet and the > main ethernet for heartbeat. For the purpose of my tests, I've changed the > config just to broadcast the heartbe

Re: [Linux-HA] Xen-HA on SLES x86_64

2007-05-03 Thread Rene Purcell
yeah ok so as I can see in the src code of the ocf xen module.. it does an xm list and checks if the vm name contained in the xenfile is running.. so even if it's 2 different vm running on each node if their xen name are both vm01 am I wrong to think that the resource agent will not see the differe

Re: [Linux-HA] Xen-HA on SLES x86_64

2007-05-03 Thread Alan Robertson
Rene Purcell wrote: > I've already read this document, with this method it's working.. they have > two VM and each node can access these VM to start it.. they are on an iSCSI > "a fake SAN" > > In my question I was trying to see if it's possible to have two > different VM > on each node, with the s

Re: [Linux-HA] Cannot locate resource script

2007-05-03 Thread Alan Robertson
Lee Hinman wrote: >> > Hi Everyone, >> > For some reason, when heartbeat is started, it logs an error over and >> > over and over in the logs about failing to find the resource script >> > lava2042 (lava2042 is the hostname of the machine). >> > >> > Here's the error I'm seeing: >> > >> > May 2 16

[Linux-HA] Stonith in 2 node setup

2007-05-03 Thread Benjamin Lawetz
Hi, I've been running some tests on my heartbeat setup with STONITH. When I go live, there will be a serial connection, a crossover ethernet and the main ethernet for heartbeat. For the purpose of my tests, I've changed the config just to broadcast the heartbeat on the crossover cable.

[Linux-HA] How many copies of attrd should be running?

2007-05-03 Thread Doug Knight
Hi all, Should there be more than one copy of attrd running on a node at the same time? I've discovered the two nodes in the cluster are not acting the same, and so far the only difference I can find is that the one with issues has the following heartbeat processes: (Current DC node) nobody135

Re: [Linux-HA] How to add resources into crm through crm admin tools?

2007-05-03 Thread Tao Yu
Thanks for the information! By doing that, I guess the resource will be added as the last one. Is that correct? Is there any control on the order? Thanks. On 5/3/07, Benjamin Lawetz <[EMAIL PROTECTED]> wrote: Using cibadmin For example to create a node: Create a file node.xml containing your

Re: [Linux-HA] Cannot locate resource script

2007-05-03 Thread Lee Hinman
> Hi Everyone, > For some reason, when heartbeat is started, it logs an error over and > over and over in the logs about failing to find the resource script > lava2042 (lava2042 is the hostname of the machine). > > Here's the error I'm seeing: > > May 2 16:31:54 lava2042 ResourceManager[10454]: [

RE: [Linux-HA] How to add resources into crm through crm admin tools?

2007-05-03 Thread Benjamin Lawetz
Using cibadmin. For example, to create a node: create a file node.xml containing your node definition, ex: Then run cibadmin -C -o nodes -X node.xml Deleting and updating resources can also be done with cibadmin. Just use cibadmin --help to find out more.

[Linux-HA] How to add resources into crm through crm admin tools?

2007-05-03 Thread Tao Yu
Hi, Sorry for my beginner's question. Is it possible to add resources using crm admin tools? I read the man pages for crmadmin, crm_resource, but can't see any command to do this. Did I miss anything? Thanks, Tao

RE: [Linux-HA] External STONITH timeout

2007-05-03 Thread Benjamin Lawetz
Default_action_timeout did not seem to make a difference, but changing the cluster-delay did manage to change the timeout of the stonith. This: Gave me a 60s timeout waiting for the Stonith (version 2.0.8). But unfortunately the problem was elsewhere :-( Thanks for putting me on the right track D

Re: [Linux-HA] NFS server not started by heartbeat

2007-05-03 Thread Alan Robertson
Martijn Grendelman wrote: > Hi, > > I am trying to build a 2-node cluster serving DRBD+NFS, among other > things. It has been operational on Debian Sarge, with Heartbeat 1.2, but > recently, both machines were upgraded to Debian Etch, and today I > upgraded Heartbeat to 2.0.7. I maintained the R1

Re: [Linux-HA] Cannot create group containing drbd using HB GUI

2007-05-03 Thread Alan Robertson
Dejan Muhamedagic wrote: > On Thu, May 03, 2007 at 09:26:56AM -0400, Doug Knight wrote: >> Hmm, kill -9 on the active node is not sufficient to simulate a node >> going down. Heartbeat goes away, but the file system remains mounted and >> drbd remains primary on what was the active node. > > Is t

Re: [Linux-HA] Cannot create group containing drbd using HB GUI

2007-05-03 Thread Doug Knight
On Thu, 2007-05-03 at 16:12 +0200, Dejan Muhamedagic wrote: > On Thu, May 03, 2007 at 09:08:12AM -0400, Doug Knight wrote: > > Thanks Dejan, I'll try the kill -9. One thing I'm seeing is that I can > > easily move the resources between nodes using the constraint, > > but if I shutdown heartbeat o

[Linux-HA] NFS server not started by heartbeat

2007-05-03 Thread Martijn Grendelman
Hi, I am trying to build a 2-node cluster serving DRBD+NFS, among other things. It has been operational on Debian Sarge, with Heartbeat 1.2, but recently, both machines were upgraded to Debian Etch, and today I upgraded Heartbeat to 2.0.7. I maintained the R1 style configuration. Heartbeat is runn

Re: [Linux-HA] Xen-HA on SLES x86_64

2007-05-03 Thread Rene Purcell
I've already read this document, with this method it's working.. they have two VM and each node can access these VM to start it.. they are on an iSCSI "a fake SAN" In my question I was trying to see if it's possible to have two different VM on each node, with the same name.. VM01 and VM02 on node1

Re: [Linux-HA] RA return code for "running but broken"

2007-05-03 Thread Lars Marowsky-Bree
On 2007-05-03T09:25:12, Yan Fitterer <[EMAIL PROTECTED]> wrote: > What return code should an OCF RA return on a monitor operation when > the service is "running but broken" (for ex. process present, but > services not available)? > > If the RA returns OCF_NOT_RUNNING, then will hb do a "stop" bef

Re: [Linux-HA] Cannot create group containing drbd using HB GUI

2007-05-03 Thread Dejan Muhamedagic
On Thu, May 03, 2007 at 09:08:12AM -0400, Doug Knight wrote: > Thanks Dejan, I'll try the kill -9. One thing I'm seeing is that I can > easily move the resources between nodes using the constraint, > but if I shutdown heartbeat on one node (/etc/init.d/heartbeat stop) I > run into problems. If I s

Re: [Linux-HA] standalone pingd.sh

2007-05-03 Thread Alan Robertson
David Lee wrote: > On Mon, 30 Apr 2007, David Lee wrote: > >> [...] >> We already have such code, and already have it duplicated (ouch!) in >> "resources/OCF/IPaddr.in" and "resources/heartbeat/IPaddr.in". And >> "pingd.sh" is in danger of making this triplicate. >> [...] > > Following my own em

Re: [Linux-HA] Cannot create group containing drbd using HB GUI

2007-05-03 Thread Dejan Muhamedagic
On Thu, May 03, 2007 at 09:26:56AM -0400, Doug Knight wrote: > Hmm, kill -9 on the active node is not sufficient to simulate a node > going down. Heartbeat goes away, but the file system remains mounted and > drbd remains primary on what was the active node. Is there a way to force umount a drbd

Re: [Linux-HA] standalone pingd.sh

2007-05-03 Thread David Lee
On Mon, 30 Apr 2007, David Lee wrote: > [...] > We already have such code, and already have it duplicated (ouch!) in > "resources/OCF/IPaddr.in" and "resources/heartbeat/IPaddr.in". And > "pingd.sh" is in danger of making this triplicate. > [...] Following my own email above, and going to a slig

Re: [Linux-HA] Cannot create group containing drbd using HB GUI

2007-05-03 Thread Doug Knight
Hmm, kill -9 on the active node is not sufficient to simulate a node going down. Heartbeat goes away, but the file system remains mounted and drbd remains primary on what was the active node. On Thu, 2007-05-03 at 09:08 -0400, Doug Knight wrote: > Thanks Dejan, I'll try the kill -9. One thing I'm

Re: [Linux-HA] Cannot create group containing drbd using HB GUI

2007-05-03 Thread Doug Knight
Thanks Dejan, I'll try the kill -9. One thing I'm seeing is that I can easily move the resources between nodes using the constraint, but if I shutdown heartbeat on one node (/etc/init.d/heartbeat stop) I run into problems. If I shutdown the node with the active resources, heartbeat migrates the DR

Re: [Linux-HA] Cannot locate resource script

2007-05-03 Thread Alan Robertson
Lee Hinman wrote: > Hi Everyone, > For some reason, when heartbeat is started, it logs an error over and > over and over in the logs about failing to find the resource script > lava2042 (lava2042 is the hostname of the machine). > > Here's the error I'm seeing: > > May 2 16:31:54 lava2042 Resour

Re: [Linux-HA] RA return code for "running but broken"

2007-05-03 Thread Alan Robertson
Yan Fitterer wrote: > What return code should an OCF RA return on a monitor operation when > the service is "running but broken" (for ex. process present, but > services not available)? > > If the RA returns OCF_NOT_RUNNING, then will hb do a "stop" before > any "start" when going from "unmanaged"

Re: [Linux-HA] Dispatch function delayed warnings in log

2007-05-03 Thread Alan Robertson
Dejan Muhamedagic wrote: > On Fri, Apr 27, 2007 at 11:37:52AM -0400, Doug Knight wrote: >> I'm getting the following warnings in the log, is it something I should >> investigate or is it not to worry? I've seen 14 in the last 18 hours, on >> a pair of fairly lightly loaded development servers, with

Re: [Linux-HA] Dispatch function delayed warnings in log

2007-05-03 Thread Dejan Muhamedagic
On Fri, Apr 27, 2007 at 11:37:52AM -0400, Doug Knight wrote: > I'm getting the following warnings in the log, is it something I should > investigate or is it not to worry? I've seen 14 in the last 18 hours, on > a pair of fairly lightly loaded development servers, with no pattern of > when they occ

Re: [Linux-HA] ERROR: parse_xml: Expected: action - HB 2.0.8

2007-05-03 Thread Dejan Muhamedagic
On Wed, May 02, 2007 at 09:31:16AM +1000, Alex Strachan wrote: > Hi Alan, > > Please excuse my ignorance but could you expand on 'meta-data operation'. > > Is there meta-data operations that I need to do when start/stop or > monitoring a resource, or is it a definition that need changing? The me

Re: [Linux-HA] Problems with ldirectord - masq and gate on same real ip:port

2007-05-03 Thread Simon Horman
On Thu, May 03, 2007 at 12:27:23PM +0200, Kristoffer Egefelt wrote: > Hi Horms, > > Thanks for the update, but when we try this version: > > >http://www.vergenet.net/~horms/linux/ldirectord/download/ldirectord.2007-05-01.e022c4b33b0e > > we get: > > > TCP 192.168.0.5:87 wrr > -> 10.10.11.87

Re: [Linux-HA] Cannot create group containing drbd using HB GUI

2007-05-03 Thread Dejan Muhamedagic
On Fri, Apr 27, 2007 at 03:10:22PM -0400, Doug Knight wrote: > I now have a working configuration with DRBD master/slave, and a > filesystem/pgsql/ipaddr group following it around. So far, I've been > using a Place constraint and modifying its uname value to test the "fail > over" of the resources.

Re: [Linux-HA] Cannot create group containing drbd using HB GUI

2007-05-03 Thread Dejan Muhamedagic
On Wed, May 02, 2007 at 11:25:10AM -0400, Doug Knight wrote: > Thanks Lars, that makes sense. So, to start and stop a master/slave > resource, do you recommend adding target_role=stopped to stop it, and > deleting target_role altogether to start it? Yes, I'd say that removing the target_role is a

Re: [Linux-HA] Problems with ldirectord - masq and gate on same real ip:port

2007-05-03 Thread Kristoffer Egefelt
Hi Horms, Thanks for the update, but when we try this version: http://www.vergenet.net/~horms/linux/ldirectord/download/ldirectord.2007-05-01.e022c4b33b0e we get: TCP 192.168.0.5:87 wrr -> 10.10.11.87:87 Masq0 0 0 TCP 10.10.11.89:87 wrr -> 10.10.11.87:

Re: [Linux-HA] Possible bug? Heartbeat not assigning Slave status on resource startup

2007-05-03 Thread Andrew Beekhof
Started and Slave are basically the same state - so nothing is wrong as such - though it might be nice if it did in fact show Slave instead of Started. On 5/2/07, Doug Knight <[EMAIL PROTECTED]> wrote: When I initially start up a master_slave drbd resource (ms_dbrd_7788), using a Place constrain

Re: [Linux-HA] [ HELP ] pingd not failover (Active/Standy)

2007-05-03 Thread Andrew Beekhof
You can't use score_attribute and score in the same rule. In such cases score_attribute is ignored. When calling ptest, can you include the "-I filename" option, which saves the input being used to a file, and then attach it here please. On 5/3/07, chiu chun chir <[EMAIL PROTECTED]> wrote: Hi And

Re: [Linux-HA] External STONITH timeout

2007-05-03 Thread Andrew Beekhof
On 5/2/07, Dave Blaschke <[EMAIL PROTECTED]> wrote: Benjamin Lawetz wrote: > Hi all, > > I've written an external stonith plugin for a Sentry power switch. > It works fine from the command line (save that it takes 25s for the ssh to > login). This causes problems when the STONITH tries to k

Re: [Linux-HA] RA return code for "running but broken"

2007-05-03 Thread Andrew Beekhof
On 5/3/07, Yan Fitterer <[EMAIL PROTECTED]> wrote: What return code should an OCF RA return on a monitor operation when the service is "running but broken" (for ex. process present, but services not available)? OCF_ERR_GENERIC=1 If the RA returns OCF_NOT_RUNNING, then will hb do a "stop" b
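To illustrate the distinction drawn in this reply, here is a minimal sketch of a monitor action that reports "running but broken" as OCF_ERR_GENERIC rather than OCF_NOT_RUNNING. The numeric exit codes are the standard OCF values; the function names demo_monitor, process_present and service_responds, and the PROC_UP and SVC_OK variables, are hypothetical placeholders for whatever checks a real resource agent would perform.

```shell
#!/bin/sh
# Standard OCF exit codes.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# Hypothetical probes: a real agent would check a pid file, a port, etc.
# Here they read PROC_UP and SVC_OK so the sketch can be exercised by hand.
process_present()  { [ "${PROC_UP:-0}" = "1" ]; }
service_responds() { [ "${SVC_OK:-0}" = "1" ]; }

demo_monitor() {
    if ! process_present; then
        # Nothing is running at all: cleanly stopped.
        return $OCF_NOT_RUNNING
    fi
    if ! service_responds; then
        # Process exists but the service is unusable: report a failure.
        return $OCF_ERR_GENERIC
    fi
    return $OCF_SUCCESS
}
```

Setting PROC_UP=1 and SVC_OK=0 before calling demo_monitor exercises the "running but broken" branch.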

Re: [Linux-HA] Xen-HA on SLES x86_64

2007-05-03 Thread Andrew Beekhof
hard to comment without seeing your config On 5/2/07, Rene Purcell <[EMAIL PROTECTED]> wrote: Hi all, just want to know if this kind of setup is possible with heartbeat. - There's two nodes. ( node1 and node2 ) - On each nodes there's two DomU ( vm01 on node 1 and vm01 on node2 ) they all have

[Linux-HA] RA return code for "running but broken"

2007-05-03 Thread Yan Fitterer
What return code should an OCF RA return on a monitor operation when the service is "running but broken" (for ex. process present, but services not available)? If the RA returns OCF_NOT_RUNNING, then will hb do a "stop" before any "start" when going from "unmanaged" to "managed" for example? W

RE: [Linux-HA] Xen-HA on SLES x86_64

2007-05-03 Thread Sander van Vugt
Have a look at this: http://www.novell.com/linux/technical_library/has.pdf, I think that document answers your questions. Regards, Sander > -Original message- > From: [EMAIL PROTECTED] [mailto:linux-ha- > [EMAIL PROTECTED] On behalf of Rene Purcell > Sent: Wednesday 2 May 2007 21:43 >