Re: [Linux-HA] How to remove nodes with hb_gui

2009-08-05 Thread Yan Gao
On Wed, 2009-08-05 at 20:42 -0400, Bernie Wu wrote:
> Hi Listers,
> How can I remove nodes that currently appear in my Linux HA Management Client 
> ?
If it's a heartbeat-based cluster, you should first run hb_delnode to
delete the nodes.

Then delete them from the CIB:
If you are using the latest cluster stack, you can either delete them
via the GUI (if you have pacemaker-mgmt installed) or run "crm node
delete ...".
If you are still using heartbeat-2.1, you have to run cibadmin to delete
them.
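
For a current stack, the two steps above could be wrapped roughly as follows. This is only a sketch: the helper name remove_stale_node and the RUN preview variable are made up for illustration, and on some installs hb_delnode lives in heartbeat's library directory rather than in PATH.

```shell
#!/bin/sh
# Hypothetical helper, not part of heartbeat or pacemaker.
# Set RUN=echo to preview the commands instead of executing them.
RUN="${RUN:-}"

remove_stale_node() {
    node="$1"
    # Step 1: remove the node from heartbeat's membership.
    $RUN hb_delnode "$node" &&
    # Step 2: remove the node from the CIB (needs the crm shell).
    $RUN crm node delete "$node"
}
```

Running it once with RUN=echo for each stale node shows what would be executed before anything is changed.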

>   These nodes belong to another cluster and they appear as stopped.
> 
> TIA
> Bernie
> 
> 
> The information contained in this e-mail message is intended only for the 
> personal and confidential use of the recipient(s) named above. This message 
> may be an attorney-client communication and/or work product and as such is 
> privileged and confidential. If the reader of this message is not the 
> intended recipient or an agent responsible for delivering it to the intended 
> recipient, you are hereby notified that you have received this document in 
> error and that any review, dissemination, distribution, or copying of this 
> message is strictly prohibited. If you have received this communication in 
> error, please notify us immediately by e-mail, and delete the original 
> message.
> ___
> Linux-HA mailing list
> Linux-HA@lists.linux-ha.org
> http://lists.linux-ha.org/mailman/listinfo/linux-ha
> See also: http://linux-ha.org/ReportingProblems
-- 
Regards,
Yan Gao
China R&D Software Engineer
y...@novell.com

Novell, Inc.
Making IT Work As One™


[Linux-HA] How to remove nodes with hb_gui

2009-08-05 Thread Bernie Wu
Hi Listers,
How can I remove nodes that currently appear in my Linux HA Management Client ? 
 These nodes belong to another cluster and they appear as stopped.

TIA
Bernie




Re: [Linux-HA] Try #2 on DRBD

2009-08-05 Thread Robert L. Harris


I actually found exactly what you said 5 minutes after I sent the email.
Would have been great for backing up the filesystem.

I have heartbeat up and running for a 3rd IP and found a bit to mount it:

grandpa IPaddr::192.168.0.243/24/eth0 Filesystem::/dev/drbd0::/data::xfs::defaults

Is that all I need to have the backup machine (grandma) mount /data when 
grandpa fails?

Robert


On 8/5/09 4:56 PM, Brian R. Hellman wrote:
> There is no way to mount a secondary device without first promoting it
> to primary.
> If what you're looking to do is have one server primary with a read-only
> secondary, it is not possible.
>
> Your secondary servers 'cat /proc/drbd' should look similar with the
> exception of Primary/Secondary are reversed.
>
> Brian
>
> Robert L. Harris wrote:
>
>> I have remade the system using single primary since that's all that
>> is needed in reality.
>> At current I have this on the primary:
>>
>> r...@grandpa:~# cat /proc/drbd
>> version: 8.3.0 (api:88/proto:86-89)
>> GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by
>> iv...@ubuntu, 2009-01-17 07:49:56
>>0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---
>>   ns:825415525 nr:0 dw:133509 dr:825282384 al:37 bm:50372 lo:0 pe:0
>> ua:0 ap:0 ep:1 wo:b oos:0
>>
>> I get the same off the secondary system as well.  When I try to mount
>> the xfs filesystem I am
>> getting this:
>>
>> r...@grandma:~# mount -t xfs -o ro /dev/drbd0 /data/
>> mount: Wrong medium type
>>
>> I am looking but don't see a doc that says how to mount the image read
>> only on the secondary
>> machine.
>>
>> Robert
>>
>>
>>  

-- 

:wq!

Robert L. Harris | GPG Key ID: E344DA3B
  @ x-hkp://pgp.mit.edu
DISCLAIMER:
   These are MY OPINIONS With Dreams To Be A King,
ALONE.  I speak for  First One Should Be A Man
no-one else.   - Manowar




Re: [Linux-HA] Try #2 on DRBD

2009-08-05 Thread Brian R. Hellman
There is no way to mount a secondary device without first promoting it
to primary.
If what you're looking to do is have one server primary with a read-only
secondary, it is not possible.

The 'cat /proc/drbd' output on your secondary server should look similar,
except that Primary/Secondary is reversed.

Brian
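
The point above (promote first, then mount) could be sketched like this. The function names are invented for illustration, and the resource name passed to drbdadm (for example r0) is an assumption, since the thread never shows drbd.conf.

```shell
#!/bin/sh
# Print the local role (the part before the slash in the ro: field)
# for one DRBD minor, reading /proc/drbd-style text on stdin.
drbd_local_role() {
    minor="$1"
    sed -n "s/^ *${minor}: .* ro:\([A-Za-z]*\)\/.*/\1/p"
}

# Promote the resource, then mount its device.
# Arguments: resource name (assumed, e.g. r0), device, mount point.
mount_after_promote() {
    resource="$1"; device="$2"; mountpoint="$3"
    drbdadm primary "$resource" || return 1
    mount -t xfs "$device" "$mountpoint"
}
```

For example, "drbd_local_role 0 < /proc/drbd" prints Primary or Secondary for /dev/drbd0. In a single-primary setup the promotion is expected to fail while the peer is still Primary, so the other node has to unmount and run "drbdadm secondary" first.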

Robert L. Harris wrote:
>I have remade the system using single primary since that's all that 
> is needed in reality.
> At current I have this on the primary:
>
> r...@grandpa:~# cat /proc/drbd
> version: 8.3.0 (api:88/proto:86-89)
> GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by 
> iv...@ubuntu, 2009-01-17 07:49:56
>   0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---
>  ns:825415525 nr:0 dw:133509 dr:825282384 al:37 bm:50372 lo:0 pe:0 
> ua:0 ap:0 ep:1 wo:b oos:0
>
> I get the same off the secondary system as well.  When I try to mount 
> the xfs filesystem I am
> getting this:
>
> r...@grandma:~# mount -t xfs -o ro /dev/drbd0 /data/
> mount: Wrong medium type
>
> I am looking but don't see a doc that says how to mount the image read 
> only on the secondary
> machine.
>
> Robert
>
>   


[Linux-HA] Try #2 on DRBD

2009-08-05 Thread Robert L. Harris

I have remade the system using single primary since that's all that is
needed in reality.
At current I have this on the primary:

r...@grandpa:~# cat /proc/drbd
version: 8.3.0 (api:88/proto:86-89)
GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by iv...@ubuntu, 2009-01-17 07:49:56
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:825415525 nr:0 dw:133509 dr:825282384 al:37 bm:50372 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

I get the same off the secondary system as well.  When I try to mount the
xfs filesystem I am getting this:

r...@grandma:~# mount -t xfs -o ro /dev/drbd0 /data/
mount: Wrong medium type

I am looking but don't see a doc that says how to mount the image read only
on the secondary machine.

Robert

-- 

:wq!

Robert L. Harris | GPG Key ID: E344DA3B
  @ x-hkp://pgp.mit.edu
DISCLAIMER:
   These are MY OPINIONS With Dreams To Be A King,
ALONE.  I speak for  First One Should Be A Man
no-one else.   - Manowar




Re: [Linux-HA] Getting heartbeat to report current node

2009-08-05 Thread Cantwell, Bryan
I am not using crm, because I find that it does not work reliably, so I am 
using the 2.0.8 in v1 mode essentially.
I use monit to detect  status of my services and issue restart or hb_standby if 
needed.

Is 2.0.8 not a good version? It was the latest version I saw available in rpm 
for my redhat ...



-Original Message-
From: linux-ha-boun...@lists.linux-ha.org 
[mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Andrew Beekhof
Sent: Wednesday, August 05, 2009 1:35 AM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Getting heartbeat to report current node

On Tue, Aug 4, 2009 at 6:16 PM, Cantwell, Bryan wrote:
> I am running heartbeat 2.0.8 on linux.

yikes!

> I'm building a web interface to show information about my cluster. What 
> command can I use to ask heartbeat which is the node that is currently active?

Depends which resource manager you're using.
If your resources are configured in cib.xml, run crm_mon.
If you're using haresources... dunno
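
For the haresources case, one possible approach is heartbeat's cl_status tool, whose rscstatus subcommand prints one of all, local, foreign or none. The sketch below only interprets that word. The function name holds_resources is made up, and reading "none" as "this node is currently not the active one" is an assumption that fits a simple active/passive pair.

```shell
#!/bin/sh
# Interpret the single word printed by `cl_status rscstatus`.
# Returns 0 if the node holds some resources, 1 if it holds none,
# 2 if the word is not one of the expected values.
holds_resources() {
    case "$1" in
        all|local|foreign) return 0 ;;
        none)              return 1 ;;
        *)                 return 2 ;;
    esac
}

# Possible use on each node (v1 / haresources setups):
#   holds_resources "$(cl_status rscstatus 2>/dev/null)" && echo "$(uname -n) is active"
# With resources configured in cib.xml, a one-shot status is available via:
#   crm_mon -1
```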


Re: [Linux-HA] probable bug with the OCF drbd script

2009-08-05 Thread Marc Cousin
On Wednesday 05 August 2009 15:46:26 Lars Ellenberg wrote:
> On Wed, Aug 05, 2009 at 01:51:03PM +0200, Marc Cousin wrote:
> > Hi,
> >
> > I know this is a very old thread.
> >
> > But I'm now trying heartbeat 2.99.2 (from the provided rpm packages) and
> > I still have these 'local's in the drbd script.
>
> Please use DRBD 8.3.2 (or newer, in case someone digs up this thread in
> six month again), which includes ocf/linbit/drbd RA.
> usage is "compatible" to the old ocf/heartbeat/drbd one,
> and documented in the DRBD User's Guide.

I will do that. But if the other is obsolete, shouldn't it be removed ?


Re: [Linux-HA] Pacemaker 1.4 & HBv2 1.99 // About quorum choice (contd.)

2009-08-05 Thread Dominik Klein
Alain.Moulle wrote:
> Thanks Andrew,
> 
> 1. So my understanding is that in a "more than 2 nodes cluster" , if
> two nodes are failed, the have_quorum is set to 0 by the cluster soft
> and the behavior is choosen by the administrator with the no-quorum-policy
> parameter. So the question is now : what is the best choice for 
> no-quorum-policy
> value ? My feeling is that "ignore" would be the best choice if all services
> can run without problems on the remaining healthy nodes.

That's not the only case this can happen. If you run into split-brain,
each node may be healthy but the network connections may be broken. With
"ignore", you will end up with resources running multiple times. That's
a problem sometimes ;)

Don't use ignore in >2 node clusters.

> "suicide" or "stop" : my understanding is that it will kill the 
> remaining healthy nodes or
> stop the services running on them, so it does not sound good for me ... 
> "freeze" : don't see the difference between "freeze" and "ignore" ... ?
> 
> Am I right ?
> 
> 2. and what about the quorum policy in a two-nodes cluster ?

You need working stonith and policy=ignore, as no node can have >50% on
its own. When the connection is lost, one node will shoot the other. The
cluster software should not be started at boot time, otherwise you will
end up in a stonith death match. There was quite a nice explanation on
the pacemaker list some time ago. Look for STONITH Deathmatch Explained
in the archives.
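
With the crm shell, the two-node settings described above might be applied roughly like this. The wrapper name and the RUN preview variable are invented for illustration, and a working stonith resource still has to be configured separately.

```shell
#!/bin/sh
# Hypothetical wrapper. Set RUN=echo to preview instead of executing.
RUN="${RUN:-}"

configure_two_node_quorum() {
    # Fencing must be enabled before quorum loss is ignored.
    $RUN crm configure property stonith-enabled=true &&
    $RUN crm configure property no-quorum-policy=ignore
}

# Keeping the cluster software from starting at boot is distribution
# specific, for example on Red Hat style systems:
#   chkconfig heartbeat off
```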

Regards
Dominik


Re: [Linux-HA] probable bug with the OCF drbd script

2009-08-05 Thread Lars Ellenberg
On Wed, Aug 05, 2009 at 01:51:03PM +0200, Marc Cousin wrote:
> Hi,
> 
> I know this is a very old thread.
> 
> But I'm now trying heartbeat 2.99.2 (from the provided rpm packages) and I 
> still have these 'local's in the drbd script.

Please use DRBD 8.3.2 (or newer, in case someone digs up this thread in
six months again), which includes the ocf/linbit/drbd RA.
Its usage is "compatible" with the old ocf/heartbeat/drbd one,
and it is documented in the DRBD User's Guide.


-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.


Re: [Linux-HA] Pacemaker 1.4 & HBv2 1.99 // About quorum choice (contd.)

2009-08-05 Thread Alain.Moulle
Thanks Andrew,

1. So my understanding is that in a cluster of more than 2 nodes, if
two nodes have failed, have_quorum is set to 0 by the cluster software
and the behavior is chosen by the administrator with the no-quorum-policy
parameter. So the question is now: what is the best choice for the
no-quorum-policy value? My feeling is that "ignore" would be the best
choice if all services can run without problems on the remaining healthy
nodes.

"suicide" or "stop": my understanding is that it will kill the remaining
healthy nodes or stop the services running on them, so it does not sound
good to me ...

"freeze": I don't see the difference between "freeze" and "ignore" ... ?

Am I right ?

2. and what about the quorum policy in a two-nodes cluster ?

Thanks
Alain

>
> There is only one way to get quorum, have more than half of the nodes online.
> You can look at the no-quorum-policy option though, that affects what
> the cluster does when it doesn't have quorum.


Re: [Linux-HA] alias interfaces and heartbeat 2 on centOS

2009-08-05 Thread Testuser SST
I didn't know or read about any restrictions on alias interfaces yet.

> eth0:0 isn't a real device, you may find that's the problem.
> 
> -Shane
> 
> On 05/08/2009, at 8:46 PM, Testuser SST wrote:
> 
> >
> >
> > yes it is, the drbd-device is working fine on that interface.
> >
> >
> >> This interface eth0:0 its up?
> >>
> >> 2009/8/5 Testuser SST 
> >>
> >>> Hi,
> >>>
> >>> is there a problem using alias-interfaces with heartbeat 2.1.3
> >> (CentOS-RPM)
> >>> ? I configured in the ha.cf something like:
> >>>
> >>> ucast eth0:0 192.168.95.13
> >>>
> >>> and got:
> >>>
> >>> debug: opening ucast eth0:0 (UDP/IP unicast)
> >>> heartbeat[23629]: 2009/08/05_14:21:31 info: glib: ucast: write  
> >>> socket
> >>> priority set to IPTOS_LOWDELAY on eth0:0
> >>> heartbeat[23629]: 2009/08/05_14:21:31 ERROR: glib: ucast: error  
> >>> setting
> >>> option SO_BINDTODEVICE(w) on eth0:0: No such device
> >>> heartbeat[23629]: 2009/08/05_14:21:31 ERROR: make_io_childpair:  
> >>> cannot
> >> open
> >>> ucast eth0:0
> >>> heartbeat[23629]: 2009/08/05_14:21:31 debug: Exiting from pid 23629
> >> [rc=1]
> >>> heartbeat[23632]: 2009/08/05_14:21:32 debug: pid 23632 locked in  
> >>> memory.
> >>> heartbeat[23632]: 2009/08/05_14:21:32 debug: Limiting CPU: 6 CPU  
> >>> seconds
> >>> every 6 milliseconds
> >>> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown:  
> >>> Master
> >>> Control process died.
> >>> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Killing pid 23629 with
> >> SIGTERM
> >>> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown(MCP
> >> dead):
> >>> Killing ourselves.
> >>> heartbeat[23632]: 2009/08/05_14:21:33 debug: Process 23632  
> >>> processing
> >>> SIGTERM
> >>> heartbeat[23632]: 2009/08/05_14:21:33 debug: Exiting from pid 23632
> >> [rc=15]
> >>>
> >>> Any suggestions are welcome
> >>>
> >>> Kind Regards
> >>>
> >>> SST
> >>>



Re: [Linux-HA] alias interfaces and heartbeat 2 on centOS

2009-08-05 Thread Shane Short
eth0:0 isn't a real device, you may find that's the problem.

-Shane
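
The "No such device" error in the log is consistent with this: SO_BINDTODEVICE expects a kernel device name, while eth0:0 is only a label for a second address on eth0. A small sketch for deriving the underlying device from an alias label (the function name base_iface is made up):

```shell
#!/bin/sh
# Strip an alias suffix such as ":0" from an interface label.
base_iface() {
    printf '%s\n' "${1%%:*}"
}

# The ha.cf line would then presumably name the real device:
#   ucast eth0 192.168.95.13
```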

On 05/08/2009, at 8:46 PM, Testuser SST wrote:

>
>
> yes it is, the drbd-device is working fine on that interface.
>
>
>> This interface eth0:0 its up?
>>
>> 2009/8/5 Testuser SST 
>>
>>> Hi,
>>>
>>> is there a problem using alias-interfaces with heartbeat 2.1.3
>> (CentOS-RPM)
>>> ? I configured in the ha.cf something like:
>>>
>>> ucast eth0:0 192.168.95.13
>>>
>>> and got:
>>>
>>> debug: opening ucast eth0:0 (UDP/IP unicast)
>>> heartbeat[23629]: 2009/08/05_14:21:31 info: glib: ucast: write  
>>> socket
>>> priority set to IPTOS_LOWDELAY on eth0:0
>>> heartbeat[23629]: 2009/08/05_14:21:31 ERROR: glib: ucast: error  
>>> setting
>>> option SO_BINDTODEVICE(w) on eth0:0: No such device
>>> heartbeat[23629]: 2009/08/05_14:21:31 ERROR: make_io_childpair:  
>>> cannot
>> open
>>> ucast eth0:0
>>> heartbeat[23629]: 2009/08/05_14:21:31 debug: Exiting from pid 23629
>> [rc=1]
>>> heartbeat[23632]: 2009/08/05_14:21:32 debug: pid 23632 locked in  
>>> memory.
>>> heartbeat[23632]: 2009/08/05_14:21:32 debug: Limiting CPU: 6 CPU  
>>> seconds
>>> every 6 milliseconds
>>> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown:  
>>> Master
>>> Control process died.
>>> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Killing pid 23629 with
>> SIGTERM
>>> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown(MCP
>> dead):
>>> Killing ourselves.
>>> heartbeat[23632]: 2009/08/05_14:21:33 debug: Process 23632  
>>> processing
>>> SIGTERM
>>> heartbeat[23632]: 2009/08/05_14:21:33 debug: Exiting from pid 23632
>> [rc=15]
>>>
>>> Any suggestions are welcome
>>>
>>> Kind Regards
>>>
>>> SST
>>>


Re: [Linux-HA] alias interfaces and heartbeat 2 on centOS

2009-08-05 Thread Testuser SST


yes it is, the drbd-device is working fine on that interface.


> This interface eth0:0 its up?
> 
> 2009/8/5 Testuser SST 
> 
> > Hi,
> >
> > is there a problem using alias-interfaces with heartbeat 2.1.3
> (CentOS-RPM)
> > ? I configured in the ha.cf something like:
> >
> > ucast eth0:0 192.168.95.13
> >
> > and got:
> >
> > debug: opening ucast eth0:0 (UDP/IP unicast)
> > heartbeat[23629]: 2009/08/05_14:21:31 info: glib: ucast: write socket
> > priority set to IPTOS_LOWDELAY on eth0:0
> > heartbeat[23629]: 2009/08/05_14:21:31 ERROR: glib: ucast: error setting
> > option SO_BINDTODEVICE(w) on eth0:0: No such device
> > heartbeat[23629]: 2009/08/05_14:21:31 ERROR: make_io_childpair: cannot
> open
> > ucast eth0:0
> > heartbeat[23629]: 2009/08/05_14:21:31 debug: Exiting from pid 23629
> [rc=1]
> > heartbeat[23632]: 2009/08/05_14:21:32 debug: pid 23632 locked in memory.
> > heartbeat[23632]: 2009/08/05_14:21:32 debug: Limiting CPU: 6 CPU seconds
> > every 6 milliseconds
> > heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown: Master
> > Control process died.
> > heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Killing pid 23629 with
> SIGTERM
> > heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown(MCP
> dead):
> > Killing ourselves.
> > heartbeat[23632]: 2009/08/05_14:21:33 debug: Process 23632 processing
> > SIGTERM
> > heartbeat[23632]: 2009/08/05_14:21:33 debug: Exiting from pid 23632
> [rc=15]
> >
> > Any suggestions are welcome
> >
> > Kind Regards
> >
> > SST
> >


Re: [Linux-HA] alias interfaces and heartbeat 2 on centOS

2009-08-05 Thread maike
This interface eth0:0 its up?

2009/8/5 Testuser SST 

> Hi,
>
> is there a problem using alias-interfaces with heartbeat 2.1.3 (CentOS-RPM)
> ? I configured in the ha.cf something like:
>
> ucast eth0:0 192.168.95.13
>
> and got:
>
> debug: opening ucast eth0:0 (UDP/IP unicast)
> heartbeat[23629]: 2009/08/05_14:21:31 info: glib: ucast: write socket
> priority set to IPTOS_LOWDELAY on eth0:0
> heartbeat[23629]: 2009/08/05_14:21:31 ERROR: glib: ucast: error setting
> option SO_BINDTODEVICE(w) on eth0:0: No such device
> heartbeat[23629]: 2009/08/05_14:21:31 ERROR: make_io_childpair: cannot open
> ucast eth0:0
> heartbeat[23629]: 2009/08/05_14:21:31 debug: Exiting from pid 23629 [rc=1]
> heartbeat[23632]: 2009/08/05_14:21:32 debug: pid 23632 locked in memory.
> heartbeat[23632]: 2009/08/05_14:21:32 debug: Limiting CPU: 6 CPU seconds
> every 6 milliseconds
> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown: Master
> Control process died.
> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Killing pid 23629 with SIGTERM
> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown(MCP dead):
> Killing ourselves.
> heartbeat[23632]: 2009/08/05_14:21:33 debug: Process 23632 processing
> SIGTERM
> heartbeat[23632]: 2009/08/05_14:21:33 debug: Exiting from pid 23632 [rc=15]
>
> Any suggestions are welcome
>
> Kind Regards
>
> SST
>



-- 
Att,
Maiquel


[Linux-HA] alias interfaces and heartbeat 2 on centOS

2009-08-05 Thread Testuser SST
Hi,

Is there a problem using alias interfaces with heartbeat 2.1.3 (CentOS RPM)?
I configured something like this in ha.cf:

ucast eth0:0 192.168.95.13

and got:

debug: opening ucast eth0:0 (UDP/IP unicast)
heartbeat[23629]: 2009/08/05_14:21:31 info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0:0
heartbeat[23629]: 2009/08/05_14:21:31 ERROR: glib: ucast: error setting option SO_BINDTODEVICE(w) on eth0:0: No such device
heartbeat[23629]: 2009/08/05_14:21:31 ERROR: make_io_childpair: cannot open ucast eth0:0
heartbeat[23629]: 2009/08/05_14:21:31 debug: Exiting from pid 23629 [rc=1]
heartbeat[23632]: 2009/08/05_14:21:32 debug: pid 23632 locked in memory.
heartbeat[23632]: 2009/08/05_14:21:32 debug: Limiting CPU: 6 CPU seconds every 6 milliseconds
heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown: Master Control process died.
heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Killing pid 23629 with SIGTERM
heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
heartbeat[23632]: 2009/08/05_14:21:33 debug: Process 23632 processing SIGTERM
heartbeat[23632]: 2009/08/05_14:21:33 debug: Exiting from pid 23632 [rc=15]

Any suggestions are welcome

Kind Regards

SST



Re: [Linux-HA] probable bug with the OCF drbd script

2009-08-05 Thread Marc Cousin
Hi,

I know this is a very old thread.

But I'm now trying heartbeat 2.99.2 (from the provided rpm packages) and I 
still have these 'local's in the drbd script.

Is this normal ? Has the bug been corrected ?


On Monday 17 November 2008 12:35:20 Dejan Muhamedagic wrote:
> Hi,
>
> On Fri, Nov 14, 2008 at 09:13:54AM +0100, Marc Cousin wrote:
> > Hi,
> >
> > I've been fighting with the OCF DRBD script returning me success when
> > trying to get a resource secondary, when it failed : the drbdadm
> > secondary command fails (returns 11) and the drbd script returns 0. It's
> > with heartbeat 2.1.4.
> >
> > I think I've located the culprit :
> >
> > do_drbdadm() {
> >     local cmd="$DRBDADM -c $DRBDCONF $*"
> >     ocf_log debug "$RESOURCE: Calling $cmd"
> >     local cmd_out=$($cmd 2>&1)
> >     ret=$?
> >     # Trim the garbage drbdadm likes to print when using the node
> >     # override feature:
> >     local cmd_ret=$(echo $cmd_out | sed -e 's/found __DRBD_NODE__.*< [...]
> >         ocf_log err "$RESOURCE: Called $cmd"
> >         ocf_log err "$RESOURCE: Exit code $ret"
> >         ocf_log err "$RESOURCE: Command output: $cmd_ret"
> >     else
> >         ocf_log debug "$RESOURCE: Exit code $ret"
> >         ocf_log debug "$RESOURCE: Command output: $cmd_ret"
> >     fi
> >     echo $cmd_ret
> >     return $ret
> > }
> >
> >
> > local cmd_out=$($cmd 2>&1)
> > ret=$?
> >
> > In this case $? always is 0. As I don't know sh that much (and hate it a
> > lot :) ), I've been trying to find the reason, and as soon as I remove
> > the 'local', the return code is transmitted to $? again.
>
> True. Interesting that nobody found this before.
>
> > This is quite an important problem I think, because whenever heartbeat
> > fails to do a drbd command, it thinks it has worked (instead of
> > retrying).
>
> Fixed in the development repository.
>
> Cheers,
>
> Dejan
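
The behaviour discussed in this thread can be reproduced in isolation. In the sketch below (function names invented for the demonstration), the first function reports the exit status of the "local" builtin itself, while the second declares the variable first and so sees the status of the command substitution.

```shell
#!/bin/bash
# `local var=$(cmd)` sets $? to the status of `local`, which succeeds.
masked_status() {
    local out=$(false)
    echo $?
}

# Declaring first and assigning afterwards keeps the command's status.
preserved_status() {
    local out
    out=$(false)
    echo $?
}
```

Calling masked_status prints 0 and preserved_status prints 1, which matches the report that removing 'local' made the return code visible again.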


Re: [Linux-HA] Pacemaker 1.4 & HBv2 1.99 // About quorum choice

2009-08-05 Thread Andrew Beekhof
On Wed, Aug 5, 2009 at 10:22 AM, Alain.Moulle wrote:
> Hello,
>
> I'm a little bit confusing about quorum configuration :
>  there is the have-quorum parameter which is normally
>  managed by the cluster itself.
>    On my configuration, its value is "0"
> But the Pacemaker documentation says :
>    "have-quorum : If false, this may mean that the cluster cannot start
> resources
>    or fence other nodes. "

0 == false

>
> So I guess it is quite mandatory to set have-quorum to "1" , isn't it ?

You can't set it; it's a property of the cluster.
Any value you set will be overwritten by the actual quorum state the
cluster has.

>
> So I tried :
> crm_attribute --attr-name have-quorum --attr-value true
> The have-quorum is always "0" in the cib.xml " but in the " nv-pair :
>  value="true"/>
> so I guess it overloads the  But anyway, what is the best choice for have-quorum for a cluster of
> let's say between 2 and 8 nodes  ?

There is only one way to get quorum: have more than half of the nodes online.
You can look at the no-quorum-policy option though, that affects what
the cluster does when it doesn't have quorum.
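
The "more than half" rule is simple arithmetic. As an illustration only (has_quorum is a made-up helper, not a cluster command):

```shell
#!/bin/sh
# has_quorum ONLINE TOTAL
# Succeeds when strictly more than half of TOTAL nodes are online.
has_quorum() {
    online="$1"; total="$2"
    [ $((online * 2)) -gt "$total" ]
}
```

With 3 nodes, 2 online nodes have quorum. With 2 nodes, a single surviving node never does, and in an 8-node cluster a 4/4 split leaves neither half with quorum.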


Re: [Linux-HA] Command to see if a resource is started or not

2009-08-05 Thread Tobias Appel
On 08/05/2009 11:09 AM, Dominik Klein wrote:
> Tobias Appel wrote:
>> On 08/05/2009 10:30 AM, Dominik Klein wrote:
>>> Tobias Appel wrote:
>>>> So all I need is a command line tool to check wether a resource is
>>>> currently started or not. I tried to check the resources with the
>>>> failcount command, but it's always 0. And the crm_resource command is
>>>> used to configure a resource but does not seem to give me the status of
>>>> a resource.
>>>>
>>>> I know I can use crm_mon but I would rather have a small command since I
>>>> could include this in our monitoring tool (nagios).
>>> crm resource status
>>>
>>> Regards
>>> Dominik
>> Thanks for the fast reply Dominik,
>>
>> I forgot to mention that I still run Heartbeat version 2.1.4.
>> It seems crm_resource does not respond to the status flag. Or am I too
>> stupid?
>
> It is not crm_resource, I meant crm resource (notice the blank).
>
> But the crm command is not in 2.1.4
>
> Try crm_resource -W -r <resource-id>
>
> Regards
> Dominik


Thanks a lot - this is exactly what I needed!



Re: [Linux-HA] Command to see if a resource is started or not

2009-08-05 Thread Dominik Klein
Tobias Appel wrote:
> On 08/05/2009 10:30 AM, Dominik Klein wrote:
>> Tobias Appel wrote:
>>> So all I need is a command line tool to check wether a resource is
>>> currently started or not. I tried to check the resources with the
>>> failcount command, but it's always 0. And the crm_resource command is
>>> used to configure a resource but does not seem to give me the status of
>>> a resource.
>>>
>>> I know I can use crm_mon but I would rather have a small command since I
>>> could include this in our monitoring tool (nagios).
>> crm resource status
>>
>> Regards
>> Dominik
> 
> Thanks for the fast reply Dominik,
> 
> I forgot to mention that I still run Heartbeat version 2.1.4.
> It seems crm_resource does not respond to the status flag. Or am I too 
> stupid?

It is not crm_resource, I meant crm resource (notice the blank).

But the crm command is not in 2.1.4

Try crm_resource -W -r <resource-id>

Regards
Dominik
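
For the nagios use case, the command above could be wrapped in a small check. This is a sketch under assumptions: the function names are invented, and the matched phrases ("is running on" / "is NOT running") are what crm_resource -W is assumed to print, so verify them against your version. Exit codes follow the usual nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN).

```shell
#!/bin/sh
# Turn the text printed by `crm_resource -W -r ID` into a nagios result.
rsc_state_from_output() {
    case "$1" in
        *"is NOT running"*) echo "CRITICAL: $1"; return 2 ;;
        *"is running on"*)  echo "OK: $1";       return 0 ;;
        *)                  echo "UNKNOWN: $1";  return 3 ;;
    esac
}

# Query the cluster for one resource id and report its state.
check_resource() {
    rsc_state_from_output "$(crm_resource -W -r "$1" 2>&1)"
}
```

A nagios command definition could then call check_resource with the resource id as its only argument.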


Re: [Linux-HA] Command to see if a resource is started or not

2009-08-05 Thread Tobias Appel
On 08/05/2009 10:30 AM, Dominik Klein wrote:
> Tobias Appel wrote:
>>
>> So all I need is a command line tool to check whether a resource is
>> currently started or not. I tried to check the resources with the
>> failcount command, but it's always 0. And the crm_resource command is
>> used to configure a resource but does not seem to give me the status of
>> a resource.
>>
>> I know I can use crm_mon but I would rather have a small command since I
>> could include this in our monitoring tool (nagios).
>
> crm resource status
>
> Regards
> Dominik

Thanks for the fast reply Dominik,

I forgot to mention that I still run Heartbeat version 2.1.4.
It seems crm_resource does not respond to the status flag. Or am I too 
stupid?

Bye,
Tobi


Re: [Linux-HA] Command to see if a resource is started or not

2009-08-05 Thread Dominik Klein
Tobias Appel wrote:
> Hi,
> 
> I need a command to see if a resource is started or not. Somehow my IPMI 
> resource does not always start, especially on one node (for example if I 
> reboot the node, or have a failover). There is no error and nothing, it 
> just does nothing at all.
> Usually I have to clean up the resource and then it comes back by itself.
> This is not really a problem since this only occurs after a failover or 
> reboot and when that happens, somebody usually takes a look at the 
> cluster anyway. But some people forget to start it again, and when we do 
> maintenance we have to turn it off on purpose since it would wreak
> havoc and turn off one of the nodes.
> 
> So all I need is a command line tool to check whether a resource is
> currently started or not. I tried to check the resources with the 
> failcount command, but it's always 0. And the crm_resource command is 
> used to configure a resource but does not seem to give me the status of 
> a resource.
> 
> I know I can use crm_mon but I would rather have a small command since I 
> could include this in our monitoring tool (nagios).

crm resource status 

Regards
Dominik


[Linux-HA] Command to see if a resource is started or not

2009-08-05 Thread Tobias Appel
Hi,

I need a command to see if a resource is started or not. Somehow my IPMI 
resource does not always start, especially on one node (for example if I 
reboot the node, or have a failover). There is no error and nothing, it 
just does nothing at all.
Usually I have to clean up the resource and then it comes back by itself.
This is not really a problem since this only occurs after a failover or 
reboot and when that happens, somebody usually takes a look at the 
cluster anyway. But some people forget to start it again, and when we do 
maintenance we have to turn it off on purpose since it would wreak
havoc and turn off one of the nodes.

So all I need is a command line tool to check whether a resource is
currently started or not. I tried to check the resources with the 
failcount command, but it's always 0. And the crm_resource command is 
used to configure a resource but does not seem to give me the status of 
a resource.

I know I can use crm_mon but I would rather have a small command since I 
could include this in our monitoring tool (nagios).

Thanks in advance,

Tobi


[Linux-HA] Pacemaker 1.4 & HBv2 1.99 // About quorum choice

2009-08-05 Thread Alain.Moulle
Hello,

I'm a little bit confused about quorum configuration:
  there is the have-quorum parameter which is normally
  managed by the cluster itself.
On my configuration, its value is "0"
But the Pacemaker documentation says :
"have-quorum : If false, this may mean that the cluster cannot start
resources or fence other nodes."

So I guess it is mandatory to set have-quorum to "1", isn't it?

So I tried :
crm_attribute --attr-name have-quorum --attr-value true
The have-quorum value is always "0" in cib.xml,
so I guess it overloads the [...]
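
As background to the question above: the message itself notes that have-quorum is normally managed by the cluster, so it is a value one would read rather than set. A minimal sketch of reading it follows. It assumes that cibadmin -Q prints the CIB as XML and that have-quorum is an attribute of the top-level cib element. The function names are hypothetical and the sample XML is made up for illustration, not taken from a real cluster.

```python
import subprocess
import xml.etree.ElementTree as ET


def have_quorum(cib_xml):
    """Return True/False from the have-quorum attribute, or None if absent."""
    root = ET.fromstring(cib_xml)
    value = root.get("have-quorum")
    if value is None:
        return None
    # Accept both the "0"/"1" and the "false"/"true" spellings.
    return value.strip().lower() in ("1", "true")


def query_cib():
    """Fetch the live CIB as XML text via cibadmin -Q (needs a running cluster)."""
    return subprocess.check_output(["cibadmin", "-Q"], universal_newlines=True)


# Made-up sample document, not output from a real cluster.
SAMPLE = '<cib have-quorum="0" epoch="1"><configuration/><status/></cib>'
print(have_quorum(SAMPLE))  # prints: False
```

On a cluster node, have_quorum(query_cib()) would report the live value. The sketch only reads the flag and makes no attempt to change it.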


[Linux-HA] ANNOUNCE: New Linux-HA repository structure

2009-08-05 Thread Andrew Beekhof
Lars has asked me to announce that at long last, we have finalized the  
new Linux-HA repository/project structure.

Effective immediately, Heartbeat 2.x has been split into the following  
projects:
* cluster-glue 1.0
* resource-agents 1.0
* heartbeat 3.0-beta

### Cluster Glue 1.0 - http://hg.linux-ha.org/glue/ - 
http://hg.linux-ha.org/glue/archive/glue-1.0.tar.gz
A collection of common tools that are useful for writing cluster  
stacks such as Heartbeat and cluster managers such as Pacemaker.
Provides a local resource manager that understands the OCF and LSB  
standards, and an interface to common STONITH devices.

### Resource Agents 1.0 - http://hg.linux-ha.org/agents/ - 
http://hg.linux-ha.org/agents/archive/agents-1.0.tar.gz
OCF compliant scripts to allow common services to operate in a High  
Availability environment.

### Heartbeat 3.0-beta - http://hg.linux-ha.org/dev/
A cluster stack providing messaging and membership services that can  
be used by resource managers such as Pacemaker.
Heartbeat still contains the simple 2-node resource manager (aka.  
haresources) from before version 2.
The board will release 3.0-final at a time of its choosing.


These changes have been put in place to allow the group to release
updates at intervals that are suitable for each individual project.
This also makes better use of our limited QA resources as we are no  
longer forced to test the entire stack in order to release an updated  
set of resource agents.

Additionally, the changes aim to increase the usage of the individual  
components by allowing them to be used independently.


Preliminary packages for the most recent openSUSE, SLES, Fedora and  
RHEL releases are currently available at
   http://download.opensuse.org/repositories/server:/ha-clustering:/NG

Older distros can be added if there is sufficient demand.
The existing repositories will be migrated to the new package layout  
over the coming days and weeks.

-- Andrew

