Re: [Linux-ha-dev] [PATCH] SNMP subagent fixes for 2.1.3

2007-12-19 Thread Dejan Muhamedagic
Hi, On Wed, Dec 19, 2007 at 01:57:17PM +0900, Keisuke MORI wrote: Dejan, Thank you for committing the SNMP extension to the repository, but would you please include the following patches along with it? 1) SNMP: fix a problem on displaying an unmanaged group and to check it in

Re: [Linux-ha-dev] Re: Re: pgsql ra (WAS: Re: ids ra)

2007-12-19 Thread Dejan Muhamedagic
On Tue, Dec 18, 2007 at 11:14:05AM -0700, Serge Dubrouski wrote: Dejan - Did you cover this in the Mercurial? Just now. Thanks for the reminder. Dejan On Dec 16, 2007 6:20 PM, HIDEO YAMAUCHI [EMAIL PROTECTED] wrote: Hi, Please disregard this. I found where was the problem and here

[Linux-HA] Checklist

2007-12-19 Thread Jochen Lienhard
Hi, I'm looking for a checklist, so we can simulate different errors. Example: - failure with switch at node 1 take out the network wire out of the switch and look if node 2 takes the process Does such a list exist? There are so many failures that can happen, so that it would be nice to

Re: [Linux-HA] Heartbeat v2 CIB/API questions

2007-12-19 Thread Andrew Beekhof
On Dec 18, 2007, at 5:35 PM, Scott Mann wrote: On Tue 12/18/2007 12:52 AM, Andrew Beekhof said: On Dec 17, 2007, at 11:28 PM, Scott Mann wrote: in v2 mode you can't monitor the resource using the HA API... only via the CIB. Yes, right. Figured that out when my ha api call for

Re: [Linux-HA] Heartbeat Service fails in the first start.

2007-12-19 Thread Dejan Muhamedagic
Hi, On Wed, Dec 19, 2007 at 04:47:31PM +0900, HIDEO YAMAUCHI wrote: Hi, I installed a development version in the following procedures. 1)Initial installation of RHEL5(Update1). 2)Installation of libnet. 3)Installation of Heartbeat development version(Heartbeat-Dev-d739f7e38999).

Re: [Linux-HA] Checklist

2007-12-19 Thread Dejan Muhamedagic
Hi, On Wed, Dec 19, 2007 at 08:52:00AM +0100, Jochen Lienhard wrote: Hi, I'm looking for a checklist, so we can simulate different errors. Example: - failure with switch at node 1 take out the network wire out of the switch and look if node 2 takes the process Does such a list

Re: [Linux-HA] Heartbeat Service fails in the first start.

2007-12-19 Thread Andrew Beekhof
On Dec 19, 2007, at 8:47 AM, HIDEO YAMAUCHI wrote: Hi, I installed a development version in the following procedures. 1)Initial installation of RHEL5(Update1). 2)Installation of libnet. 3)Installation of Heartbeat development version(Heartbeat-Dev-d739f7e38999). 4)Setting such as ha.cf

Re: [Linux-HA] not migrating on cib.xml description

2007-12-19 Thread Andrew Beekhof
On Dec 18, 2007, at 5:49 AM, DAIKI MATSUDA wrote: Hi, Andrew. I am sorry for sending incomplete information and attached the results of 'cibadmin -Ql' for common cib.xml files. So, could you confirm them? i'm confused why do you think the group should move? the rule rule id=loc1:rule1

Re: [Linux-HA] RA monitor interval

2007-12-19 Thread Andrew Beekhof
On Dec 18, 2007, at 6:28 AM, DAIKI MATSUDA wrote: Hi, Andrew. I am sorry for delay because I was not aware of requiring logs. And I attached the hb_report log. Best Regards MATSUDA, Daiki 2007/12/12, Andrew Beekhof [EMAIL PROTECTED]: On Dec 12, 2007, at 7:41 AM, DAIKI MATSUDA wrote: Hi,

[Linux-HA] Failover IP on dedicated interface

2007-12-19 Thread Anders Bruun Olsen
Hi, I am setting up a dedicated backend fileserver HA-cluster where I have one dedicated network interface per server that needs to access the data. This means that I am not really interested in only doing failover of secondary IP-addresses (or aliases), but rather the primary IP-address of real

[Linux-HA] Heartbeat configuration older error

2007-12-19 Thread Fernando Iglesias
Hi, I've a bit strange problem with cibadmin, I should replace actual cib.xml with a new hand-edited cib.xml, so I executed: cibadmin -R -x /path/to/newfile/cib.xml, but next error is shown: Call cib_replace failed (-45): Update was older than existing configuration null But I'm sure it's not

Re: [Linux-HA] Heartbeat configuration older error

2007-12-19 Thread Dejan Muhamedagic
Hi, On Wed, Dec 19, 2007 at 11:34:38AM +0100, Fernando Iglesias wrote: Hi, I've a bit strange problem with cibadmin, I should replace actual cib.xml with a new hand-edited cib.xml, so I executed: cibadmin -R -x /path/to/newfile/cib.xml, but next error is shown: Call cib_replace failed

Re: [Linux-HA] Heartbeat Service fails in the first start.

2007-12-19 Thread Lars Marowsky-Bree
On 2007-12-19T11:32:12, Andrew Beekhof [EMAIL PROTECTED] wrote: i prefer to use the crm respawn directive which disables the fast-fail logic^. when a non-transient problem like this occurs and heartbeat is started at boot time (which is the normal thing to do), you have about 2s to identify

Re: [Linux-HA] Failover IP on dedicated interface

2007-12-19 Thread Anders Bruun Olsen
Dejan Muhamedagic wrote: To sum up, I have eth2 dedicated to a specific IP on both cluster nodes and I want heartbeat to do failover for it. Is there a way to do this without ending up with eth2:0, with the current resource scripts? Why do you need to move the real interface addresses? What's

Re: [Linux-HA] Failover IP on dedicated interface

2007-12-19 Thread Dejan Muhamedagic
Hi, On Wed, Dec 19, 2007 at 01:09:00PM +0100, Anders Bruun Olsen wrote: Hi, I am setting up a dedicated backend fileserver HA-cluster where I have one dedicated network interface per server that needs to access the data. This means that I am not really interested in only doing failover of

Re: [Linux-HA] Failover IP on dedicated interface

2007-12-19 Thread Anders Bruun Olsen
[EMAIL PROTECTED] wrote: By having 2 Ethernet ports on each machine, one for the heartbeat and one as an active interface. If you are using a crossover for the heartbeat, the active interface would allow you to access the machine when it was not your active node. I have eth0 connected to our

RE: [Linux-HA] Failover IP on dedicated interface

2007-12-19 Thread sliebman
By having 2 Ethernet ports on each machine, one for the heartbeat and one as an active interface. If you are using a crossover for the heartbeat, the active interface would allow you to access the machine when it was not your active node. Simon -Original Message- From: [EMAIL

RE: [Linux-HA] Failover IP on dedicated interface

2007-12-19 Thread sliebman
What we've done is set up our machines with crossover Ethernet on Eth1 for synch and has IP addresses for our machines on Eth0 and a floating IP address for failover for our applications. This has been working in our production environment for our front end (API) boxes on a MySQL cluster

Re: [Linux-HA] Heartbeat Service fails in the first start.

2007-12-19 Thread Andrew Beekhof
On Dec 19, 2007, at 2:13 PM, Lars Marowsky-Bree wrote: On 2007-12-19T11:32:12, Andrew Beekhof [EMAIL PROTECTED] wrote: i prefer to use the crm respawn directive which disables the fast-fail logic^. when a non-transient problem like this occurs and heartbeat is started at boot time (which

[Linux-HA] segfault when trying to put date_spec/date_expression

2007-12-19 Thread Cousin Marc
Hi, I'm trying to put 2 sets of rules for stickiness and I'm having crm_mon and crmd segfaulting if I try to put them. I've narrowed it down to putting this rule : cibadmin -C -o crm_config -X ' cluster_property_set id=heures_maintenance score=100 rule id=cluster:heures_nuit boolean_op=or

Re: [Linux-HA] segfault when trying to put date_spec/date_expression

2007-12-19 Thread Andrew Beekhof
On Dec 19, 2007, at 3:10 PM, Cousin Marc wrote: Hi, I'm trying to put 2 sets of rules for stickiness and I'm having crm_mon and crmd segfaulting if I try to put them. I've narrowed it down to putting this rule : cibadmin -C -o crm_config -X ' cluster_property_set id=heures_maintenance

Re: [Linux-HA] segfault when trying to put date_spec/date_expression

2007-12-19 Thread Cousin Marc
sorry, I knew I was forgetting something :) Version : 2.1.2+hg.11310.702e4f418ca8-2 (debian sid) The bt from core dump for crm_mon (it's not compiled with debug, I guess...): (gdb) bt full #0 0xb7e4d35c in cron_range_satisfied () from /usr/lib/libpe_status.so.1 No symbol table info available.

Re: [Linux-HA] segfault when trying to put date_spec/date_expression

2007-12-19 Thread Andrew Beekhof
On Dec 19, 2007, at 4:09 PM, Cousin Marc wrote: sorry, I knew I was forgetting something :) Version : 2.1.2+hg.11310.702e4f418ca8-2 (debian sid) The bt from core dump for crm_mon (it's not compiled with debug, I guess...): nod :-( but at least i have the exact version number - it's

Re: [Linux-HA] segfault when trying to put date_spec/date_expression

2007-12-19 Thread Andrew Beekhof
On Dec 19, 2007, at 4:09 PM, Cousin Marc wrote: sorry, I knew I was forgetting something :) Version : 2.1.2+hg.11310.702e4f418ca8-2 (debian sid) Ah, this was fixed back in October http://hg.beekhof.net/lha/crm-dev/rev/64fd1fb9d725 If it helps, the patch is included in the latest interim

Re: [Linux-HA] not migrating on cib.xml description

2007-12-19 Thread DAIKI MATSUDA
2007/12/19, Andrew Beekhof [EMAIL PROTECTED]: On Dec 18, 2007, at 5:49 AM, DAIKI MATSUDA wrote: Hi, Andrew. I am sorry for sending incomplete information and attached the results of 'cibadmin -Ql' for common cib.xml files. So, could you confirm them? i'm confused why do you

Re: [Linux-HA] RA monitor interval

2007-12-19 Thread DAIKI MATSUDA
2007/12/19, Andrew Beekhof [EMAIL PROTECTED]: On Dec 18, 2007, at 6:28 AM, DAIKI MATSUDA wrote: Hi, Andrew. I am sorry for delay because I was not aware of requiring logs. And I attached the hb_report log. Best Regards MATSUDA, Daiki 2007/12/12, Andrew Beekhof [EMAIL

Re: [Linux-HA] dev can not up a fail count for monitor timeout

2007-12-19 Thread Alan Robertson
Junko IKEDA wrote: Thanks a lot! Would you take in this fix to 2.1.3? I hope it's not too late. OK... It's a regression - and I don't think our tests catch this case, so I doubt it will affect the outcome of tests. Nevertheless... Dave, Dejan and (if possible) Lars: I have put this patch

[Linux-HA] print OFFLINE(standby) status using crm_mon

2007-12-19 Thread Junko IKEDA
Hi, While I check crm_mon display, it keeps node's status as standby after that node was shutdown. CIB status is here; node_state uname=prec370e crmd=offline in_ccm=false ha=dead join=down id=9d9ca527-cea9-470c-9e03-e49fe5630bba shutdown=0 expected=down