Re: [Linux-HA] Resource Failover but won't stick

2008-03-03 Thread b52
> Hello, > I have a couple of XEN resources set up, and they work fine until a node gets > fenced. The resources fail over to the running node, which is good. Then, > when the dead node is rebooted and comes back, the XEN resources > immediately try to start on the rebooted node and fail, the node gets > fen

[Linux-HA] Very simple HA webserver setup

2008-03-03 Thread Brian Kirkbride
Hello all, I've been researching Linux-HA for a bit now and have been thoroughly impressed with the project. Thanks so much for contributing all of this effort. I have a few questions about a simple setup: * Node 1 is an active webserver with static content * Node 2 is a passive failover (s

[Linux-HA] Resource Failover but won't stick

2008-03-03 Thread Bryan Manzeck
Hello, I have a couple of XEN resources set up, and they work fine until a node gets fenced. The resources fail over to the running node, which is good. Then, when the dead node is rebooted and comes back, the XEN resources immediately try to start on the rebooted node and fail, the node gets fenced again

Re: [Linux-HA] question: external/ssh stonith to poweroff badnode via xen-host

2008-03-03 Thread Serge Dubrouski
On Mon, Mar 3, 2008 at 12:08 PM, Lino Moragon <[EMAIL PROTECTED]> wrote: > Serge Dubrouski wrote: > > Configuration looks right to me, I even tested it and it worked fine > > on my test cluster. So hints are obvious: > > > > 1. Check that you really put that script on a second node and made it

Re: [Linux-HA] question: external/ssh stonith to poweroff badnode via xen-host

2008-03-03 Thread Lino Moragon
Serge Dubrouski wrote: Configuration looks right to me, I even tested it and it worked fine on my test cluster. So hints are obvious: 1. Check that you really put that script on a second node and made it executable. That was my first error, but I noticed an error message in the logfile and c

Re: [Linux-HA] weird drbd problem

2008-03-03 Thread Eddie C
I ran into that "Device is held open by someone" with DRBD master/master a lot. I believe it was a drbd bug. Once the system started reporting that message there was NOTHING I was able to do other than reboot. This includes: running all userspace "stop"-like commands --- removing module (would not l

[Linux-HA] weird drbd problem

2008-03-03 Thread Dan Gahlinger
Getting this in /var/log/messages, and it's causing the system to reboot ResourceManager[12747]: debug: /etc/ha.d/resource.d/drbddisk r0 stop done. RC=11 ResourceManager[12747]: ERROR: Return code 11 from /etc/ha.d/resource.d/drbddisk ResourceManager[12747]: info: Retrying failed stop operation [d

Re: [Linux-HA] question: external/ssh stonith to poweroff badnode via xen-host

2008-03-03 Thread Serge Dubrouski
Configuration looks right to me, I even tested it and it worked fine on my test cluster. So hints are obvious: 1. Check that you really put that script on a second node and made it executable. 2. Nodes should be able to ping each other. That is programmed in a "status" function. On Mon, Mar 3, 200

Re: [Linux-HA] Heartbeat reboots machine after "generic plugin load failed" error - please help

2008-03-03 Thread Lars Marowsky-Bree
On 2008-03-03T15:59:30, Luis Motta Campos <[EMAIL PROTECTED]> wrote: > Finally, I've investigated my connectivity between the two nodes, and > everything seems fine on the network layer: I can see the other machine > (both sides) and there are no packet filtering firewalls running on them > (and th

Re: [Linux-HA] ERROR: crm_abort: ha_set_tm_time: Triggered assert at iso8601.c:887

2008-03-03 Thread Sebastian Reitenbach
Lars Marowsky-Bree <[EMAIL PROTECTED]> wrote: > On 2008-02-29T08:30:37, Sebastian Reitenbach <[EMAIL PROTECTED]> wrote: > > > Hi, > > > > I've seen these messages appearing when I connect the hb_gui to the mgmtd: > > > > mgmtd[6819]: 2008/02/29_08:03:51 ERROR: crm_abort: ha_set_tm_time: Trig

Re: [Linux-HA] ERROR: crm_abort: ha_set_tm_time: Triggered assert at iso8601.c:887

2008-03-03 Thread Lars Marowsky-Bree
On 2008-03-03T16:50:52, Luis Motta Campos <[EMAIL PROTECTED]> wrote: > For an "obviously fairly embarrassing bug", it's pretty complicated to > understand... :( > > Explanations and pointers to reading material are welcome. I'm not quite sure what you need; this is a bug in a date calculation cod

Re: [Linux-HA] Re: heartbeat shuts down all VM machines

2008-03-03 Thread rupert
On Fri, Feb 29, 2008 at 5:19 PM, rupert <[EMAIL PROTECTED]> wrote: > I did some googling about the ucast errors, but not much info came around. > > What can be the cause of this? I rebooted and/or restarted the > machines but always on both machines the log fills with the following > > > Feb 29

Re: [Linux-HA] question: external/ssh stonith to poweroff badnode via xen-host

2008-03-03 Thread Lino Moragon
Hi, I'm now using the most recent xen0 stonith plugin, which Serge attached to this thread on 2008-02-28. I thought I configured everything correctly, but it seems that the stonith clone cannot be started on my 2nd node. I must admit I configured the Clone via hb_gui but I still have some issues. As

Re: [Linux-HA] ERROR: crm_abort: ha_set_tm_time: Triggered assert at iso8601.c:887

2008-03-03 Thread Luis Motta Campos
Lars Marowsky-Bree wrote: > On 2008-02-29T08:30:37, Sebastian Reitenbach > <[EMAIL PROTECTED]> wrote: > >> Hi, >> >> I've seen these messages appearing when I connect the hb_gui to the >> mgmtd: >> >> mgmtd[6819]: 2008/02/29_08:03:51 ERROR: crm_abort: ha_set_tm_time: >> Triggered assert at iso86

Re: [Linux-HA] ERROR: crm_abort: ha_set_tm_time: Triggered assert at iso8601.c:887

2008-03-03 Thread Lars Marowsky-Bree
On 2008-02-29T08:30:37, Sebastian Reitenbach <[EMAIL PROTECTED]> wrote: > Hi, > > I've seen these messages appearing when I connect the hb_gui to the mgmtd: > > mgmtd[6819]: 2008/02/29_08:03:51 ERROR: crm_abort: ha_set_tm_time: Triggered > assert at iso8601.c:887 : rhs->tm_mday < 0 || lhs->days

Re: [Linux-HA] Heartbeat reboots machine after "generic plugin load failed" error - please help

2008-03-03 Thread Luis Motta Campos
Dominik Klein wrote: > Hi > > please try to change "crm on" to "crm respawn" in /etc/ha.d/ha.cf Hi Dominik. Thank you for the fast answer. Setting crm to "respawn" at least prevents the machines from rebooting, and gives me a fair chance to run other diagnostics. I'm sorry, I just realized that

Re: [Linux-HA] How to allow resources to ping-pong forever?

2008-03-03 Thread Christopher Barry
On Mon, 2008-03-03 at 10:47 +0100, Alex Spengler wrote: > Hi, > > I'm stuck in setting up my cluster. > What I want to achieve is > - run apache on whatever node together with cluster IP which is > 172.23.100.200. > - if apache fails -> switch over to other node > - if gateway 172.23.100.1 is not r

Re: [Linux-HA] Heartbeat reboots machine after "generic plugin load failed" error - please help

2008-03-03 Thread Dominik Klein
Hi please try to change "crm on" to "crm respawn" in /etc/ha.d/ha.cf Regards Dominik Luis Motta Campos wrote: Hi Linux-HA list :) I'm running CentOS 5 (Linux 2.6.18-53.1.6.el5 #1 SMP Wed Jan 23 11:28:47 EST 2008 x86_64 GNU/Linux), and Heartbeat from the packages: heartbeat.x86_64

Re: [Linux-HA] external setting of node attributes?

2008-03-03 Thread b52
> Am Montag, 3. März 2008 14:54 schrieb Michael Schwartzkopff: >> Am Montag, 3. März 2008 14:42 schrieb Andreas Kurz: >> (...) >> >> > > 1) write a script that measures the load of every node. Take the 5 >> min >> > > average! See damping below for explanation. >> > >> > Or give the already inclu

Re: [Linux-HA] external setting of node attributes?

2008-03-03 Thread Michael Schwartzkopff
Am Montag, 3. März 2008 14:54 schrieb Michael Schwartzkopff: > Am Montag, 3. März 2008 14:42 schrieb Andreas Kurz: > (...) > > > > 1) write a script that measures the load of every node. Take the 5 min > > > average! See damping below for explanation. > > > > Or give the already included 'SysInfo

Re: [Linux-HA] external setting of node attributes?

2008-03-03 Thread Michael Schwartzkopff
Am Montag, 3. März 2008 14:42 schrieb Andreas Kurz: (...) > > 1) write a script that measures the load of every node. Take the 5 min > > average! See damping below for explanation. > > Or give the already included 'SysInfo' OCF RA a try. Sorry I forgot. This is even better. -- Dr. Michael Schwa

Re: [Linux-HA] external setting of node attributes?

2008-03-03 Thread Andreas Kurz
On Mon, Mar 3, 2008 at 2:39 PM, Michael Schwartzkopff <[EMAIL PROTECTED]> wrote: > Am Montag, 3. März 2008 14:19 schrieb [EMAIL PROTECTED]: > (...) > > > >> I am using HA-V2 and would like to implement a script or program > > >> which can be used for real load balancing. It should run every x >

Re: [Linux-HA] external setting of node attributes?

2008-03-03 Thread b52
> Am Montag, 3. März 2008 14:19 schrieb [EMAIL PROTECTED]: > (...) >> >> I am using HA-V2 and would like to implement a script or program >> >> which can be used for real load balancing. It should run every x >> >> seconds and determine the cpu+IO load of the host and change a >> >> regarding node

Re: [Linux-HA] external setting of node attributes?

2008-03-03 Thread Michael Schwartzkopff
Am Montag, 3. März 2008 14:19 schrieb [EMAIL PROTECTED]: (...) > >> I am using HA-V2 and would like to implement a script or program > >> which can be used for real load balancing. It should run every x > >> seconds and determine the cpu+IO load of the host and change a > >> regarding node attribut

Re: [Linux-HA] external setting of node attributes?

2008-03-03 Thread b52
> [EMAIL PROTECTED] wrote: >> Hi, >> >> I am using HA-V2 and would like to implement a script or program >> which can be used for real load balancing. It should run every x >> seconds and determine the cpu+IO load of the host and change a >> regarding node attribute. Then crm should use this attrib

Re: [Linux-HA] external setting of node attributes?

2008-03-03 Thread Luis Motta Campos
[EMAIL PROTECTED] wrote: > Hi, > > I am using HA-V2 and would like to implement a script or program > which can be used for real load balancing. It should run every x > seconds and determine the cpu+IO load of the host and change a > regarding node attribute. Then crm should use this attribute and

[Linux-HA] external setting of node attributes?

2008-03-03 Thread b52
Hi, I am using HA-V2 and would like to implement a script or program which can be used for real load balancing. It should run every x seconds and determine the cpu+IO load of the host and change a regarding node attribute. Then crm should use this attribute and a threshold in location constraints

[Linux-HA] Heartbeat reboots machine after "generic plugin load failed" error - please help

2008-03-03 Thread Luis Motta Campos
Hi Linux-HA list :) I'm running CentOS 5 (Linux 2.6.18-53.1.6.el5 #1 SMP Wed Jan 23 11:28:47 EST 2008 x86_64 GNU/Linux), and Heartbeat from the packages: heartbeat.x86_64 2.1.3-3.el5.centos heartbeat-pils.x86_64 2.1.3-3.el5.centos heartbeat-stonith.x

Re: [Linux-HA] HA 2.1.3 DRBD 8.2.4 and Jboss 4.0.3

2008-03-03 Thread Michael Schwartzkopff
Am Montag, 3. März 2008 11:51 schrieb Marco Leone: > Hi, > (...) > While I saw a lot of mail and info about mysql (I'm reading through those > right now) I didn't find any specific information for jboss nor an > OCF agent in the heartbeat distribution. > > I suppose I just need to write a simple agen

[Linux-HA] HA 2.1.3 DRBD 8.2.4 and Jboss 4.0.3

2008-03-03 Thread Marco Leone
Hi, I've set up a couple of 2-node clusters with HA 2.1.3, DRBD 8.2.4 and Apache2. Those are working pretty well, so I'd like to try to migrate some other clusters to the HA architecture. Next will be 2-node mysql and jboss clusters. While I saw a lot of mail and info about mysql (I'm reading th

Re: [Linux-HA] How to allow resources to ping-pong forever?

2008-03-03 Thread Dominik Klein
Alex Spengler wrote: Hi, I'm stuck in setting up my cluster. What I want to achieve is - run apache on whatever node together with cluster IP which is 172.23.100.200. - if apache fails -> switch over to other node - if gateway 172.23.100.1 is not reachable -> switch over to other node AND allow

[Linux-HA] How to allow resources to ping-pong forever?

2008-03-03 Thread Alex Spengler
Hi, I'm stuck in setting up my cluster. What I want to achieve is - run apache on whatever node together with cluster IP which is 172.23.100.200. - if apache fails -> switch over to other node - if gateway 172.23.100.1 is not reachable -> switch over to other node AND allow unlimited number of swi

[Linux-HA] Simple HA cluster

2008-03-03 Thread Geoffroy ARNOUD
Hi all, I am setting up a simple HA cluster for a mysql database (active/passive with DRBD). My 2 servers will have a direct network link (bonded) for DRBD replication and heartbeat communication. For database access, network links will also be redundant, toward 2 different switches. My question is: