[Linux-HA] ping_group and ipfail...?

2008-06-02 Thread Erich Focht
Hello, I'm trying to get a pair of servers to fail over when one of their InfiniBand interfaces fails. I hoped to get this functionality with ping_groups and added these lines to ha.cf: respawn hacluster /usr/lib64/heartbeat/ipfail ping_group ibping 10.3.0.229 10.3.0.230 10.3.0.231 Now when I unplug t

[Linux-HA] Fedora RPM failure of Heartbeat 2.1.3

2008-06-02 Thread Achim Stumpf
Hi, I have installed Heartbeat 2.1.3-22.1 on FC7 from the RPMs found on http://download.opensuse.org/repositories/server:/ha-clustering/Fedora_7/x86_64/ heartbeat[1304]: 2008/06/02_13:26:48 info: Enabling logging daemon heartbeat[1304]: 2008/06/02_13:26:48 info: logfile and debug file are those

Re: [Linux-HA] ping_group and ipfail...?

2008-06-02 Thread Erich Focht
Hello Michael, thanks for the reply. With one ping_group things now work fine. Maybe somebody could update the wiki page for ping_group and point to pingd instead of ipfail for v2... I'm still having an issue: as said, with one ping_group things work as expected now (I added rsc colocation rules

Re: [Linux-HA] cl_status

2008-06-02 Thread Achim Stumpf
Andrew Beekhof wrote: On Sat, May 31, 2008 at 11:55 AM, Achim Stumpf <[EMAIL PROTECTED]> wrote: Hi, I have installed Heartbeat 2.1.3-22.1 on FC7 from the rpm's found on http://download.opensuse.org/repositories/server:/ha-clustering/Fedora_7/x86_64/ Everything runs fine, but I can't use that

[Linux-HA] Error bringing up an eth device with heartbeat

2008-06-02 Thread Wayne Gemmell
Hi all, I need eth2 to have a static address that needs to fail over between the servers. This is quite trivial, but because the link isn't set to up, the IPaddr script fails. As a workaround I'm putting "/sbin/ip link set dev eth2 up" in my startup scripts, but I'm wondering if this is being looked at

Re: [Linux-HA] Try to configure a 3 node cluster

2008-06-02 Thread Andrew Beekhof
On Mon, May 26, 2008 at 2:24 PM, Florin <[EMAIL PROTECTED]> wrote: > Andrew Beekhof wrote: >>> >>> Hi, >>> I changed the line from: >>> -- >>> >> score="-INFINITY"/> >>> -- >>> to: >>> -- >>> >> score="-INFINITY"/> >>> -- >>> and now it works, but I dont understand

Re: [Linux-HA] Error bringing up an eth device with heartbeat

2008-06-02 Thread Dejan Muhamedagic
Hi, On Mon, Jun 02, 2008 at 12:02:12PM +0200, Wayne Gemmell wrote: > Hi all > > I need eth2 to have a static address that need to fail over between the When you say "static address" what do you mean exactly? Normally, all addresses within the cluster should be static. > servers. This is quite

Re: [Linux-HA] the stop sequence for group resource

2008-06-02 Thread Andrew Beekhof
On Mon, Jun 2, 2008 at 8:51 AM, Junko IKEDA <[EMAIL PROTECTED]> wrote: > Hi, > > This is a question about the stop sequence for a group resource. > We have two nodes and six resources in one group. > > # crm_mon -1 > Node: node-b (db8f2da4-a7fb-40bf-bf14-befe4af11db7): online > Node: node-a (8029f8

Re: [Linux-HA] ping_group and ipfail...?

2008-06-02 Thread Michael Schwartzkopff
On Monday, 2 June 2008 09:59, Erich Focht wrote: > Hello, > > I'm trying to get a pair of servers failover when one of their infiniband > interfaces fails. I hoped to get this functionality with ping_groups and > added to ha.cf the lines: > > respawn hacluster /usr/lib64/heartbeat/ipfail > ping_g

Re: [Linux-HA] Fedora RPM failure of Heartbeat 2.1.3

2008-06-02 Thread Dejan Muhamedagic
Hi, On Mon, Jun 02, 2008 at 01:33:17PM +0200, Achim Stumpf wrote: > Hi, > > have installed Heartbeat 2.1.3-22.1 on FC7 from the rpm's found on > http://download.opensuse.org/repositories/server:/ha-clustering/Fedora_7/x86_64/ > > heartbeat[1304]: 2008/06/02_13:26:48 info: Enabling logging daemon >

Re: [Linux-HA] Fedora RPM failure of Heartbeat 2.1.3

2008-06-02 Thread Andrew Beekhof
On Mon, Jun 2, 2008 at 2:05 PM, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote: > Hi, > > On Mon, Jun 02, 2008 at 01:33:17PM +0200, Achim Stumpf wrote: >> Hi, >> >> have installed Heartbeat 2.1.3-22.1 on FC7 from the rpm's found on >> http://download.opensuse.org/repositories/server:/ha-clustering/Fed

Re: [Linux-HA] Error bringing up an eth device with heartbeat

2008-06-02 Thread Wayne Gemmell
On Monday 02 June 2008 13:21:05 Dejan Muhamedagic wrote: > > I need eth2 to have a static address that need to fail over between the > > When you say "static address" what do you mean exactly? Normally, > all addresses within the cluster should be static. I'll explain it a bit better. I want heartb

Re: [Linux-HA] Silly question (maybe) about hostnames and heartbeat

2008-06-02 Thread Rubin Bennett
On Sun, 2008-06-01 at 10:10 +0200, Lars Ellenberg wrote: > On Sat, May 31, 2008 at 03:59:12PM -0400, Rubin Bennett wrote: > > Hello all! > > > > After lurking for a long time on this list, I have a question about a > > failover pair I've been tasked with building up. > > > > It's a super straight

RE: [Linux-HA] heartbeat from SLES10 SP2

2008-06-02 Thread Hildebrand, Nils, 232
Hi, after upgrading from SP1 to SP2 the ucast heartbeat connection of a two-node cluster no longer works. The cluster is a combination of Xen and Heartbeat. Xen and the Xen network are started before Heartbeat starts up. Components before: 2 x SLES 10 SP1: Heartbeat 2.0.8, Xen 3.0.4

[Linux-HA] Question about bcast and ucast

2008-06-02 Thread Achim Stumpf
Hi, I want to configure only ucast for heartbeat between the two nodes. In ha.cf I got: ucast eth0 10.11.20.221 ucast eth0 10.11.20.222 ucast eth1 10.11.20.223 ucast eth1 10.11.20.224 In the logs I see: Jun 2 14:53:45 isintra5 heartbeat: [4767]: info: glib: ucast: write socket priority set

[Linux-HA] status of dopd in 2.1.3

2008-06-02 Thread Heiko Weier
Hi *, I am new to this list and am currently testing a 2-node cluster with Heartbeat in VMware Server. With dopd + fencing resource-only, the cluster does not do what it is supposed to do. In case of a complete network loss on nodeB, nodeA will not take over the resources because dopd cannot outdate

[Linux-HA] SCSI Reservation OCF Agent ?

2008-06-02 Thread Robert Heinzmann (ml)
Hi, besides STONITH, another (I would say additional) approach in shared SCSI clusters is to use SCSI reservations (2 or 3) to reserve a disk for a host and avoid data integrity problems. A situation where SCSI reservations are useful is the automatic mounting of a cluster-protected fs. If

Re: [Linux-HA] status of dopd in 2.1.3

2008-06-02 Thread Lars Ellenberg
On Mon, Jun 02, 2008 at 03:18:39PM +0200, Heiko Weier wrote: > Hi *, > > i am new to this list and currently testing a 2-node cluster with > heartbeat in vmware-server. > > With dopd + fencing resource-only the cluster does not what he is > supposed to do. > > In case of a complete network loss

Re: [Linux-HA] Error bringing up an eth device with heartbeat

2008-06-02 Thread Dejan Muhamedagic
Hi, On Mon, Jun 02, 2008 at 01:49:55PM +0200, Wayne Gemmell wrote: > On Monday 02 June 2008 13:21:05 Dejan Muhamedagic wrote: > > > I need eth2 to have a static address that need to fail over between the > > > > When you say "static address" what do you mean exactly? Normally, > > all addresses wi

Re: [Linux-HA] Question about bcast and ucast

2008-06-02 Thread Dejan Muhamedagic
Hi, On Mon, Jun 02, 2008 at 03:03:18PM +0200, Achim Stumpf wrote: > Hi, > > I want to configure only ucast for heartbeat between the two nodes. In > ha.cf I got: > > ucast eth0 10.11.20.221 > ucast eth0 10.11.20.222 > ucast eth1 10.11.20.223 > ucast eth1 10.11.20.224 > > In the logs I see: > > Ju

Re: [Linux-HA] SCSI Reservation OCF Agent ?

2008-06-02 Thread Dejan Muhamedagic
Hi, On Mon, Jun 02, 2008 at 04:38:04PM +0200, Robert Heinzmann (ml) wrote: > Hi, > > besides STONITH another (as I would say additional) approach in shared scsi > clusters is to use scsi reservations (2 or 3) to reserve a disk for a host > and avoid data integrity problems. > > A situation were

RE: [Linux-HA] heartbeat from SLES10 SP2

2008-06-02 Thread Hildebrand, Nils, 232
Hi, I found the cause of that error: > [...] > Now heartbeat says that only BRIDGE2 is up - whereas before > eth0, eth1 and eth5 were up. The xen 3.2.0 used in conjunction with SLES 10 SP2 now issues "ifup BRIDGE" after moving the physical network device to the bridge. Thus I supplied the scr

Re: [Linux-HA] heartbeat from SLES10 SP2

2008-06-02 Thread Dejan Muhamedagic
Hi, On Mon, Jun 02, 2008 at 02:27:52PM +0200, Hildebrand, Nils, 232 wrote: > Hi, > > after upgrading from SP1 to SP2 the ucast-heartbeat connection of a > two-node-cluster does not work any longer. > > The cluster is a combination of xen and heartbeat. XEN and Xen-Network > are being startet bef

[Linux-HA] How to create stonith devices using DRAC4?

2008-06-02 Thread Rob Aronson
Can anyone give me some guidance on the best way to do stonith using DRAC4 as my stonith device? I've used wti switches before, with that I create clone resources for each node. I can't come up with what I should create when I have 4 nodes, each with an address that needs to be controlled and turne

Re: [Linux-HA] SCSI Reservation OCF Agent ?

2008-06-02 Thread Greg Freemyer
On Mon, Jun 2, 2008 at 10:38 AM, Robert Heinzmann (ml) <[EMAIL PROTECTED]> wrote: > Hi, > > besides STONITH another (as I would say additional) approach in shared scsi > clusters is to use scsi reservations (2 or 3) to reserve a disk for a host > and avoid data integrity problems. > > A situation

[Linux-HA] Starting non-idle resources

2008-06-02 Thread Nuno Covas
Greetings, I have a Heartbeat implementation that may need to deal with non-idle resources at startup and be able to stop them in case they are already being served. It is not critical that both machines cannot serve the same resource for a limited amount of time, but it is bad if heartbeat doesn'

Re: [Linux-HA] SCSI Reservation OCF Agent ?

2008-06-02 Thread Dejan Muhamedagic
Hi, On Mon, Jun 02, 2008 at 02:00:14PM -0400, Greg Freemyer wrote: > On Mon, Jun 2, 2008 at 10:38 AM, Robert Heinzmann (ml) <[EMAIL PROTECTED]> > wrote: > > Hi, > > > > besides STONITH another (as I would say additional) approach in shared scsi > > clusters is to use scsi reservations (2 or 3) to

Re: [Linux-HA] How to create stonith devices using DRAC4?

2008-06-02 Thread Dejan Muhamedagic
Hi, On Mon, Jun 02, 2008 at 08:33:34AM -0700, Rob Aronson wrote: > Can anyone give me some guidance on the best way to do stonith using DRAC4 > as my stonith device? I've used wti switches before, with that I create > clone resources for each node. I can't come up with what I should create > when

Re: [Linux-HA] Starting non-idle resources

2008-06-02 Thread Lars Marowsky-Bree
On 2008-06-02T16:56:29, Nuno Covas <[EMAIL PROTECTED]> wrote: > Greetings, > > I have a Heartbeat implementation that may need to deal with non-idle > resources at startup and be able to stop them in case they are already > being served. It is not critical that both machines cannot serve the > sa

Re: [Linux-HA] SCSI Reservation OCF Agent ?

2008-06-02 Thread Greg Freemyer
On Mon, Jun 2, 2008 at 2:41 PM, Dejan Muhamedagic <[EMAIL PROTECTED]> wrote: > Hi, > > On Mon, Jun 02, 2008 at 02:00:14PM -0400, Greg Freemyer wrote: >> On Mon, Jun 2, 2008 at 10:38 AM, Robert Heinzmann (ml) <[EMAIL PROTECTED]> >> wrote: >> > Hi, >> > >> > besides STONITH another (as I would say a

[Linux-HA] How to create a rsc_order 'OR' condition

2008-06-02 Thread Alex Strachan
State1 == nodeA resourceA --- resourceC nodeB resourceB State2 == nodeA resourceA nodeB resourceB --- resourceC Rule: ResourceC can only start AFTER (resourceA or resourceB), a preference for resourceA is needed. Attempted config

RE: [Linux-HA] SCSI Reservation OCF Agent ?

2008-06-02 Thread Junko IKEDA
> On Mon, Jun 02, 2008 at 04:38:04PM +0200, Robert Heinzmann (ml) wrote: > > Hi, > > > > besides STONITH another (as I would say additional) approach in shared scsi > > clusters is to use scsi reservations (2 or 3) to reserve a disk for a host > > and avoid data integrity problems. > > > > A situat

[Linux-HA] Not started heartbeat!

2008-06-02 Thread Nguyen Quang Huy
Hi all! I have set up Heartbeat 2.1.3 on Fedora 7 (node1) and CentOS 5.0 (node2). Setup on node1 (Fedora): File ha.cf: use_logd yes bcast eth0 node ho-fileserver.ho.vpb.com.vn file-server.ho.vpb.com.vn crm on File authkeys: auth 1 1 sha1 hugo File logd.cf: logf

RE: [Linux-HA] Not started heartbeat!

2008-06-02 Thread Alex Strachan
Confirm the output from 'uname -n' on both nodes matches what is in ha.cf > -Original Message- > From: [EMAIL PROTECTED] [mailto:linux-ha- > [EMAIL PROTECTED] On Behalf Of Nguyen Quang Huy > Sent: Tuesday, 3 June 2008 11:59 AM > To: linux-ha@lists.linux-ha.org > Subject: [Linux-HA] Not s

Re: [Linux-HA] How to create a rsc_order 'OR' condition

2008-06-02 Thread Andrew Beekhof
On Tue, Jun 3, 2008 at 2:09 AM, Alex Strachan <[EMAIL PROTECTED]> wrote: > State1 > > == > > nodeA resourceA --- resourceC > > nodeB resourceB > > > State2 > > == > > nodeA resourceA > > nodeB resourceB --- resourceC > > > > Rule: ResourceC can only start AFTER (resourc

RE: [Linux-HA] the stop sequence for group resource

2008-06-02 Thread Junko IKEDA
> Then (because of the probe) we find out it _is_ running afterall and > we end up in the situation contained in pe-input-6.bz2 > > We only guarantee that the probe for rscX completes before we start the rscX. Start failures which are set as on_fail=block also induce the unmanaged status as the s