[Linux-ha-dev] [patch 0/7] Various ldirectord fixes

2007-07-03 Thread horms
Hi, these patches fix various problems that Tuomo Soini reported to me, and a few more that I found along the way. -- Horms H: http://www.vergenet.net/~horms/ W: http://www.valinux.co.jp/en/ ___ Linux-HA-Dev:

[Linux-ha-dev] [patch 5/7] Tidy up fallback

2007-07-03 Thread horms
* Use the same parsing sequence for per-virtual and global fallback * Cope with the fallback not having a port specified by using the port of the virtual service * Split the server field up into server (ip) and port, and also add a weight field so that when it's passed to get_real_id_str() it

[Linux-ha-dev] [patch 7/7] Fix bogus detection of combined check

2007-07-03 Thread horms
A combined check is denoted by $v->{checktype} = combined, not $v->{combined} = negotiate. Thanks to Tuomo Soini for spotting this. Signed-off-by: Simon Horman [EMAIL PROTECTED] Index: heartbeat/ldirectord/ldirectord.in === ---

[Linux-ha-dev] [patch 4/7] Use an alarm for HTTPS timeouts

2007-07-03 Thread horms
LWP doesn't honour timeouts for HTTPS, so use an alarm instead. This should close Bugzilla Bug #1609 http://old.linux-foundation.org/developer_bugzilla/show_bug.cgi?id=1609 Signed-off-by: Simon Horman [EMAIL PROTECTED] Index: heartbeat/ldirectord/ldirectord.in
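
The patch body is not shown in this listing, and ldirectord itself is written in Perl. As a rough illustration of the general technique the summary names (guarding a blocking call with an alarm signal when the library's own timeout is ignored), here is a minimal Python sketch. The names run_with_alarm, CheckTimeout, slow_check and fast_check are hypothetical and do not come from the patch; the approach assumes a Unix system and a call made from the main thread.

```python
import signal
import time


class CheckTimeout(Exception):
    """Raised when the wrapped call exceeds its time budget."""


def run_with_alarm(func, seconds):
    """Run func(), aborting with CheckTimeout after `seconds` whole seconds.

    Hypothetical helper: relies on SIGALRM, so it works only on Unix and
    only when called from the main thread.
    """

    def on_alarm(signum, frame):
        raise CheckTimeout(f"no result after {seconds}s")

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(seconds)  # ask the kernel to deliver SIGALRM later
    try:
        return func()
    finally:
        signal.alarm(0)  # cancel any alarm that is still pending
        signal.signal(signal.SIGALRM, previous)  # restore the old handler


def slow_check():
    # Stand-in for an HTTPS request whose library-level timeout is ignored.
    time.sleep(5)
    return "up"


def fast_check():
    # Stand-in for a request that answers promptly.
    return "up"
```

The finally clause matters in a long-running monitor: without cancelling the alarm and restoring the previous handler, a check that returns early could leave a stale alarm that interrupts unrelated code later.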

Re: [Linux-HA] Interim heartbeat packages refreshed

2007-07-03 Thread Andrew Beekhof
On 7/2/07, Alex Litvak [EMAIL PROTECTED] wrote: Thank you for your help, it works well now. Everything is installed now; there were a few messages though, perhaps they are useful. I did rpm -Uhv to install over the packages built with the previous spec. /var/tmp/rpm-tmp.28673: line 1: fg: no job

Re: [Linux-HA] stonith setup

2007-07-03 Thread Jure Pečar
On Thu, 10 May 2007 10:09:06 +0100 Peter Clapham [EMAIL PROTECTED] wrote: Jure Pečar wrote: Turns out I had a wrong ilo password in the config file. Doh ... Stonith in this case says device not accessible, which is a bit misleading. It is perfectly accessible, it just doesn't accept our

[Linux-HA] heartbeat crm drdb8.

2007-07-03 Thread Robert Lindgren
Hi, I have some trouble finding resources on how to configure heartbeat2 with crm on. Either with the gui or the commandline variants. Main problem looks to be the ocf for drbd which doesn't work with drbd 8, and I can't find a newer version in cvs either? Hopefully I'm totally wrong and there

[Linux-HA] Help understand an incident

2007-07-03 Thread Peter Kruse
Hello list! Today a failover occurred in one of our clusters. Good news: it succeeded. But... while looking through the logs we found that messages are missing on one node, so we cannot say exactly what happened. Attached is the syslog from node-2 from the time where there are no messages on

[Linux-HA] About the check on the crm_verify command

2007-07-03 Thread YAMAUCHI HIDEO
Hi. The following settings were checked by the crm_verify command. (example) 1)op id=... interval=10a ../ 2)op id=... interval= 3)op id=... interval=-10s ../ However, the error is not found by the command. (crm_verify -x cib.xml) Moreover, the development

[Linux-HA] heartbeat 2.07 cannot stop drbddisk

2007-07-03 Thread cosmih
hi, i have two machines with gentoo installed, running heartbeat 2.0.7, drbd 8.0.4 and mon. this setup provides a failover environment for a web application (apache2.2 + php + mysql). when mon stops heartbeat, or when i stop heartbeat, the machine is restarted because the drbd device

Re: [Linux-HA] Help understand an incident

2007-07-03 Thread Andrew Beekhof
On 7/3/07, Peter Kruse [EMAIL PROTECTED] wrote: Hello list! Today a failover occurred in one of our clusters. Good news: it succeeded. But... while looking through the logs we found that messages are missing on one node, so we cannot say exactly what happened. Attached is the syslog from

[Linux-HA] Help with heartbeat setup!!

2007-07-03 Thread Ramsurrun Visham
Hi to all, I wanted to know if the following is possible with Heartbeat: I have 5 PCs, 3 active (master) nodes and 2 standby (backup) nodes, all connected to a switch. The 2 standby nodes have to provide failover to any of the active nodes in case one fails. Then if any of the active nodes fails

Re: [Linux-HA] DRBD 0.7 and Heartbeat 2

2007-07-03 Thread Andrew Beekhof
On 7/2/07, Adrian Overbury [EMAIL PROTECTED] wrote: Hi list-denizens. I think I've finally gotten my head around CRM/CIB. The whole deal of cluster resources, then ordering and co-location all makes sense now. What I'm having trouble with is trying to figure out what it is in my CIB that I've

Re: [Linux-HA] Help understand an incident

2007-07-03 Thread Lars Marowsky-Bree
On 2007-07-03T17:15:08, Andrew Beekhof [EMAIL PROTECTED] wrote: if it was just resource actions - then yes. they'll all be recorded in the CIB and produce updates like the one below. look out for failing monitors which probably triggered everything. And in particular, all of this is

Re: [Linux-HA] heartbeat 2.07 cannot stop drbddisk

2007-07-03 Thread Lars Marowsky-Bree
On 2007-07-03T15:53:01, cosmih [EMAIL PROTECTED] wrote: when mon stops heartbeat, or when i stop heartbeat, the machine is restarted because the drbd device cannot be set to secondary mode. below are the config files for heartbeat and for drbd. Find out which process is keeping open the

Re: [Linux-HA] Late heartbeats with heartbeat 2.0.8

2007-07-03 Thread Lars Marowsky-Bree
On 2007-07-03T11:51:18, Matt Wilder [EMAIL PROTECTED] wrote: my ha.cf: bcast em0 logfacility local7 As a first guess, use the logging daemon by setting use_logd yes, to isolate heartbeat from logging being slow. Regards, Lars -- Teamlead Kernel, SuSE Labs, Research and Development

Re: [Linux-HA] About the update of the timeout of the monitor

2007-07-03 Thread Lars Marowsky-Bree
On 2007-07-02T22:56:26, YAMAUCHI HIDEO [EMAIL PROTECTED] wrote: A log and a configuration file were appended. Hi, the configuration file is not very useful though; we never write the cluster status to disk. Please at least use the output from cibadmin instead, or even better, also attach the

Re: [Linux-HA] About the check on the crm_verify command

2007-07-03 Thread Lars Marowsky-Bree
On 2007-07-03T21:50:11, YAMAUCHI HIDEO [EMAIL PROTECTED] wrote: Hi. The following settings were checked by the crm_verify command. (example) 1)op id=... interval=10a ../ 2)op id=... interval= 3)op id=... interval=-10s ../ However, the error is not found by the

[Linux-HA] Recovery after a failure

2007-07-03 Thread Adrian Chapela
Hello, I'm testing Heartbeat 2 to build a MySQL master/slave system that takes over on failure. Now I can test a service, test the network connectivity, etc. and all runs OK. At this point, I want the following: Server A runs the MySQL resource and IPADDR (the shared VIP address), Server B runs

Re: [Linux-HA] Late heartbeats with heartbeat 2.0.8

2007-07-03 Thread Matt Wilder
I have updated my configuration to use logd and will report back with the results. Thanks, Matt On 7/3/07, Lars Marowsky-Bree [EMAIL PROTECTED] wrote: On 2007-07-03T11:51:18, Matt Wilder [EMAIL PROTECTED] wrote: my ha.cf: bcast em0 logfacility local7 As a first guess, use the logging

Re: [Linux-HA] About the check on the crm_verify command

2007-07-03 Thread 山内 英生
Hi. Please file a bug. I registered the bug in Bugzilla (#1628). Regards, Yamauchi --- Lars Marowsky-Bree [EMAIL PROTECTED] wrote: On 2007-07-03T21:50:11, YAMAUCHI HIDEO [EMAIL PROTECTED] wrote: Hi. The following settings were checked by the crm_verify command. (example) 1)op