Re: [Linux-HA] How to remove nodes with hb_gui
On Wed, 2009-08-05 at 20:42 -0400, Bernie Wu wrote:
> Hi Listers,
> How can I remove nodes that currently appear in my Linux HA Management Client?
> These nodes belong to another cluster and they appear as stopped.

If it's a heartbeat-based cluster, first run hb_delnode to delete the nodes, and then delete them from the CIB. If you are using the latest cluster stack, you can either delete them via the GUI (if you have pacemaker-mgmt installed) or run "crm node delete ...". If you are still using heartbeat-2.1, you have to run cibadmin to delete them.

--
Regards,
Yan Gao
Software Engineer, China R&D
y...@novell.com
Novell, Inc.
Making IT Work As One

___
Linux-HA mailing list
Linux-HA@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
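[Editor's sketch of the two paths Yan describes; the node name "badnode" is a placeholder, and the exact cibadmin XML may need adjusting for your CIB schema:]

```sh
# 1) Remove the node at the heartbeat layer:
hb_delnode badnode

# 2a) heartbeat-2.1: remove the node's entry from the CIB by hand:
cibadmin -D -o nodes -X '<node uname="badnode"/>'

# 2b) Current Pacemaker stack: one command does the CIB cleanup:
crm node delete badnode
```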
[Linux-HA] How to remove nodes with hb_gui
Hi Listers,

How can I remove nodes that currently appear in my Linux HA Management Client? These nodes belong to another cluster and they appear as stopped.

TIA
Bernie

The information contained in this e-mail message is intended only for the personal and confidential use of the recipient(s) named above. This message may be an attorney-client communication and/or work product and as such is privileged and confidential. If the reader of this message is not the intended recipient or an agent responsible for delivering it to the intended recipient, you are hereby notified that you have received this document in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify us immediately by e-mail, and delete the original message.
Re: [Linux-HA] Try #2 on DRBD
I actually found exactly what you said 5 minutes after I sent the email. It would have been great for backing up the filesystem.

I have heartbeat up and running for a 3rd IP and found a bit to mount it:

grandpa IPaddr::192.168.0.243/24/eth0 Filesystem::/dev/drbd0::/data::xfs::defaults

Is that all I need to have the backup machine (grandma) mount /data when grandpa fails?

Robert

On 8/5/09 4:56 PM, Brian R. Hellman wrote:
> There is no way to mount a secondary device without first promoting it
> to primary. If what you're looking to do is have one server primary
> with a read-only secondary, it is not possible.
>
> Your secondary server's 'cat /proc/drbd' should look similar, with the
> exception that Primary/Secondary are reversed.
>
> Brian
>
> Robert L. Harris wrote:
>> I have remade the system using single primary since that's all that
>> is needed in reality. At current I have this on the primary:
>>
>> r...@grandpa:~# cat /proc/drbd
>> version: 8.3.0 (api:88/proto:86-89)
>> GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by iv...@ubuntu, 2009-01-17 07:49:56
>>  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---
>>     ns:825415525 nr:0 dw:133509 dr:825282384 al:37 bm:50372 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
>>
>> I get the same off the secondary system as well. When I try to mount
>> the xfs filesystem I am getting this:
>>
>> r...@grandma:~# mount -t xfs -o ro /dev/drbd0 /data/
>> mount: Wrong medium type
>>
>> I am looking but don't see a doc that says how to mount the image
>> read-only on the secondary machine.
>>
>> Robert

--
:wq!
Robert L. Harris | GPG Key ID: E344DA3B @ x-hkp://pgp.mit.edu
DISCLAIMER: These are MY OPINIONS ALONE. I speak for no-one else.
  With Dreams To Be A King, First One Should Be A Man
  - Manowar
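[Editor's note: in the classic v1/haresources setup, the DRBD device must also be promoted to Primary before the Filesystem resource can mount it. A hedged sketch, assuming the DRBD resource in drbd.conf is named r0:]

```sh
# /etc/ha.d/haresources -- resources start left-to-right on the active node.
# drbddisk (the heartbeat script shipped with DRBD) promotes r0 to Primary
# before the mount; without it, the standby cannot mount /dev/drbd0.
grandpa IPaddr::192.168.0.243/24/eth0 drbddisk::r0 Filesystem::/dev/drbd0::/data::xfs::defaults
```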
Re: [Linux-HA] Try #2 on DRBD
There is no way to mount a secondary device without first promoting it to primary. If what you're looking to do is have one server primary with a read-only secondary, it is not possible.

Your secondary server's 'cat /proc/drbd' should look similar, with the exception that Primary/Secondary are reversed.

Brian

Robert L. Harris wrote:
> I have remade the system using single primary since that's all that
> is needed in reality. At current I have this on the primary:
>
> r...@grandpa:~# cat /proc/drbd
> version: 8.3.0 (api:88/proto:86-89)
> GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by iv...@ubuntu, 2009-01-17 07:49:56
>  0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---
>     ns:825415525 nr:0 dw:133509 dr:825282384 al:37 bm:50372 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
>
> I get the same off the secondary system as well. When I try to mount
> the xfs filesystem I am getting this:
>
> r...@grandma:~# mount -t xfs -o ro /dev/drbd0 /data/
> mount: Wrong medium type
>
> I am looking but don't see a doc that says how to mount the image
> read-only on the secondary machine.
>
> Robert
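[Editor's sketch of the promotion Brian describes; the resource name r0 is a placeholder for whatever your drbd.conf defines:]

```sh
# On the current primary (grandpa): stop using the device, then demote it
umount /data
drbdadm secondary r0

# On the peer (grandma): promote, then mount.
# Until the promotion, mount fails with "Wrong medium type".
drbdadm primary r0
mount -t xfs /dev/drbd0 /data
```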
[Linux-HA] Try #2 on DRBD
I have remade the system using single primary since that's all that is needed in reality. At current I have this on the primary:

r...@grandpa:~# cat /proc/drbd
version: 8.3.0 (api:88/proto:86-89)
GIT-hash: 9ba8b93e24d842f0dd3fb1f9b90e8348ddb95829 build by iv...@ubuntu, 2009-01-17 07:49:56
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---
    ns:825415525 nr:0 dw:133509 dr:825282384 al:37 bm:50372 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

I get the same off the secondary system as well. When I try to mount the xfs filesystem I am getting this:

r...@grandma:~# mount -t xfs -o ro /dev/drbd0 /data/
mount: Wrong medium type

I am looking but don't see a doc that says how to mount the image read-only on the secondary machine.

Robert

--
:wq!
Robert L. Harris | GPG Key ID: E344DA3B @ x-hkp://pgp.mit.edu
DISCLAIMER: These are MY OPINIONS ALONE. I speak for no-one else.
  With Dreams To Be A King, First One Should Be A Man
  - Manowar
Re: [Linux-HA] Getting heartbeat to report current node
I am not using crm because I find that it does not work reliably, so I am using 2.0.8 in v1 mode, essentially. I use monit to detect the status of my services and issue a restart or hb_standby if needed.

Is 2.0.8 not a good version? It was the latest version I saw available as an rpm for my Red Hat ...

-----Original Message-----
From: linux-ha-boun...@lists.linux-ha.org [mailto:linux-ha-boun...@lists.linux-ha.org] On Behalf Of Andrew Beekhof
Sent: Wednesday, August 05, 2009 1:35 AM
To: General Linux-HA mailing list
Subject: Re: [Linux-HA] Getting heartbeat to report current node

On Tue, Aug 4, 2009 at 6:16 PM, Cantwell, Bryan wrote:
> I am running heartbeat 2.0.8 on linux.

yikes!

> I'm building a web interface to show information about my cluster. What
> command can I use to ask heartbeat which is the node that is currently active?

Depends which resource manager you're using.
If your resources are configured in cib.xml, run crm_mon.
If you're using haresources... dunno
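[Editor's sketch of both cases. crm_mon belongs to the CRM stack; cl_status ships with heartbeat itself and can answer the v1 question. The subcommand names below are from memory of heartbeat 2.x and should be checked against your installed version:]

```sh
# CRM mode: one-shot, non-interactive status dump (scriptable):
crm_mon -1

# v1/haresources mode: ask the local heartbeat directly:
cl_status hbstatus          # is heartbeat running on this machine?
cl_status rscstatus         # resources held by this node: local/foreign/all/none
cl_status nodestatus node1  # liveness of a given node (hypothetical name)
```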
Re: [Linux-HA] probable bug with the OCF drbd script
On Wednesday 05 August 2009 15:46:26 Lars Ellenberg wrote:
> On Wed, Aug 05, 2009 at 01:51:03PM +0200, Marc Cousin wrote:
> > Hi,
> >
> > I know this is a very old thread.
> >
> > But I'm now trying heartbeat 2.99.2 (from the provided rpm packages) and
> > I still have these 'local's in the drbd script.
>
> Please use DRBD 8.3.2 (or newer, in case someone digs up this thread in
> six months again), which includes the ocf/linbit/drbd RA.
> Usage is "compatible" with the old ocf/heartbeat/drbd one,
> and documented in the DRBD User's Guide.

I will do that. But if the other is obsolete, shouldn't it be removed?
Re: [Linux-HA] Pacemaker 1.4 & HBv2 1.99 // About quorum choice (contd.)
Alain.Moulle wrote:
> Thanks Andrew,
>
> 1. So my understanding is that in a "more than 2 nodes" cluster, if
> two nodes are failed, have_quorum is set to 0 by the cluster software
> and the behavior is chosen by the administrator with the
> no-quorum-policy parameter. So the question is now: what is the best
> choice for the no-quorum-policy value? My feeling is that "ignore"
> would be the best choice if all services can run without problems on
> the remaining healthy nodes.

That's not the only case this can happen. If you run into split-brain, each node may be healthy but the network connections may be broken. With "ignore", you will end up with resources running multiple times. That's a problem sometimes ;)

Don't use ignore in >2 node clusters.

> "suicide" or "stop": my understanding is that it will kill the
> remaining healthy nodes or stop the services running on them, so it
> does not sound good to me ...
> "freeze": don't see the difference between "freeze" and "ignore" ... ?
>
> Am I right?
>
> 2. and what about the quorum policy in a two-node cluster?

You need working stonith and policy=ignore, as no node can have >50% on its own. When the connection is lost, one node will shoot the other. The cluster software should not be started at boot time, otherwise you will end up in a stonith death match. There was quite a nice explanation on the pacemaker list some time ago. Look for "STONITH Deathmatch Explained" in the archives.

Regards
Dominik
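[Editor's sketch of the two-node recommendation in crm shell syntax (Pacemaker-era stack). The fencing resource and its parameters are placeholders for whatever STONITH hardware you actually have:]

```sh
crm configure property no-quorum-policy=ignore   # a lone node keeps running...
crm configure property stonith-enabled=true      # ...but only because it can fence
# plus a real fencing device, e.g. (hypothetical IPMI parameters):
crm configure primitive fence-node2 stonith:external/ipmi \
    params hostname=node2 ipaddr=192.168.1.2 userid=admin passwd=secret
```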
Re: [Linux-HA] probable bug with the OCF drbd script
On Wed, Aug 05, 2009 at 01:51:03PM +0200, Marc Cousin wrote:
> Hi,
>
> I know this is a very old thread.
>
> But I'm now trying heartbeat 2.99.2 (from the provided rpm packages) and I
> still have these 'local's in the drbd script.

Please use DRBD 8.3.2 (or newer, in case someone digs up this thread in six months again), which includes the ocf/linbit/drbd RA. Usage is "compatible" with the old ocf/heartbeat/drbd one, and documented in the DRBD User's Guide.

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
Re: [Linux-HA] Pacemaker 1.4 & HBv2 1.99 // About quorum choice (contd.)
Thanks Andrew,

1. So my understanding is that in a "more than 2 nodes" cluster, if two nodes are failed, have_quorum is set to 0 by the cluster software and the behavior is chosen by the administrator with the no-quorum-policy parameter. So the question is now: what is the best choice for the no-quorum-policy value? My feeling is that "ignore" would be the best choice if all services can run without problems on the remaining healthy nodes.
"suicide" or "stop": my understanding is that it will kill the remaining healthy nodes or stop the services running on them, so it does not sound good to me ...
"freeze": don't see the difference between "freeze" and "ignore" ... ?

Am I right?

2. and what about the quorum policy in a two-node cluster?

Thanks
Alain

> There is only one way to get quorum: have more than half of the nodes online.
> You can look at the no-quorum-policy option though; that affects what
> the cluster does when it doesn't have quorum.
Re: [Linux-HA] alias interfaces and heartbeat 2 on centOS
I didn't know/read about any restrictions on alias interfaces yet.

> eth0:0 isn't a real device, you may find that's the problem.
>
> -Shane
>
> On 05/08/2009, at 8:46 PM, Testuser SST wrote:
>
> > yes it is, the drbd-device is working fine on that interface.
> >
> >> This interface eth0:0, is it up?
> >>
> >> 2009/8/5 Testuser SST
> >>
> >>> Hi,
> >>>
> >>> is there a problem using alias interfaces with heartbeat 2.1.3
> >>> (CentOS-RPM)? I configured in the ha.cf something like:
> >>>
> >>> ucast eth0:0 192.168.95.13
> >>>
> >>> and got:
> >>>
> >>> debug: opening ucast eth0:0 (UDP/IP unicast)
> >>> heartbeat[23629]: 2009/08/05_14:21:31 info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0:0
> >>> heartbeat[23629]: 2009/08/05_14:21:31 ERROR: glib: ucast: error setting option SO_BINDTODEVICE(w) on eth0:0: No such device
> >>> heartbeat[23629]: 2009/08/05_14:21:31 ERROR: make_io_childpair: cannot open ucast eth0:0
> >>> heartbeat[23629]: 2009/08/05_14:21:31 debug: Exiting from pid 23629 [rc=1]
> >>> heartbeat[23632]: 2009/08/05_14:21:32 debug: pid 23632 locked in memory.
> >>> heartbeat[23632]: 2009/08/05_14:21:32 debug: Limiting CPU: 6 CPU seconds every 6 milliseconds
> >>> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown: Master Control process died.
> >>> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Killing pid 23629 with SIGTERM
> >>> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
> >>> heartbeat[23632]: 2009/08/05_14:21:33 debug: Process 23632 processing SIGTERM
> >>> heartbeat[23632]: 2009/08/05_14:21:33 debug: Exiting from pid 23632 [rc=15]
> >>>
> >>> Any suggestions are welcome
> >>>
> >>> Kind Regards
> >>>
> >>> SST
> >>
> >> --
> >> Att,
> >> Maiquel
Re: [Linux-HA] alias interfaces and heartbeat 2 on centOS
eth0:0 isn't a real device, you may find that's the problem.

-Shane

On 05/08/2009, at 8:46 PM, Testuser SST wrote:

> yes it is, the drbd-device is working fine on that interface.
>
>> This interface eth0:0, is it up?
>>
>> 2009/8/5 Testuser SST
>>
>>> Hi,
>>>
>>> is there a problem using alias interfaces with heartbeat 2.1.3
>>> (CentOS-RPM)? I configured in the ha.cf something like:
>>>
>>> ucast eth0:0 192.168.95.13
>>>
>>> and got:
>>>
>>> debug: opening ucast eth0:0 (UDP/IP unicast)
>>> heartbeat[23629]: 2009/08/05_14:21:31 info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0:0
>>> heartbeat[23629]: 2009/08/05_14:21:31 ERROR: glib: ucast: error setting option SO_BINDTODEVICE(w) on eth0:0: No such device
>>> heartbeat[23629]: 2009/08/05_14:21:31 ERROR: make_io_childpair: cannot open ucast eth0:0
>>> heartbeat[23629]: 2009/08/05_14:21:31 debug: Exiting from pid 23629 [rc=1]
>>> heartbeat[23632]: 2009/08/05_14:21:32 debug: pid 23632 locked in memory.
>>> heartbeat[23632]: 2009/08/05_14:21:32 debug: Limiting CPU: 6 CPU seconds every 6 milliseconds
>>> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown: Master Control process died.
>>> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Killing pid 23629 with SIGTERM
>>> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
>>> heartbeat[23632]: 2009/08/05_14:21:33 debug: Process 23632 processing SIGTERM
>>> heartbeat[23632]: 2009/08/05_14:21:33 debug: Exiting from pid 23632 [rc=15]
>>>
>>> Any suggestions are welcome
>>>
>>> Kind Regards
>>>
>>> SST
>>
>> --
>> Att,
>> Maiquel
Re: [Linux-HA] alias interfaces and heartbeat 2 on centOS
yes it is, the drbd-device is working fine on that interface.

> This interface eth0:0, is it up?
>
> 2009/8/5 Testuser SST
>
> > Hi,
> >
> > is there a problem using alias interfaces with heartbeat 2.1.3
> > (CentOS-RPM)? I configured in the ha.cf something like:
> >
> > ucast eth0:0 192.168.95.13
> >
> > and got:
> >
> > debug: opening ucast eth0:0 (UDP/IP unicast)
> > heartbeat[23629]: 2009/08/05_14:21:31 info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0:0
> > heartbeat[23629]: 2009/08/05_14:21:31 ERROR: glib: ucast: error setting option SO_BINDTODEVICE(w) on eth0:0: No such device
> > heartbeat[23629]: 2009/08/05_14:21:31 ERROR: make_io_childpair: cannot open ucast eth0:0
> > heartbeat[23629]: 2009/08/05_14:21:31 debug: Exiting from pid 23629 [rc=1]
> > heartbeat[23632]: 2009/08/05_14:21:32 debug: pid 23632 locked in memory.
> > heartbeat[23632]: 2009/08/05_14:21:32 debug: Limiting CPU: 6 CPU seconds every 6 milliseconds
> > heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown: Master Control process died.
> > heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Killing pid 23629 with SIGTERM
> > heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
> > heartbeat[23632]: 2009/08/05_14:21:33 debug: Process 23632 processing SIGTERM
> > heartbeat[23632]: 2009/08/05_14:21:33 debug: Exiting from pid 23632 [rc=15]
> >
> > Any suggestions are welcome
> >
> > Kind Regards
> >
> > SST
>
> --
> Att,
> Maiquel
Re: [Linux-HA] alias interfaces and heartbeat 2 on centOS
This interface eth0:0, is it up?

2009/8/5 Testuser SST

> Hi,
>
> is there a problem using alias interfaces with heartbeat 2.1.3 (CentOS-RPM)?
> I configured in the ha.cf something like:
>
> ucast eth0:0 192.168.95.13
>
> and got:
>
> debug: opening ucast eth0:0 (UDP/IP unicast)
> heartbeat[23629]: 2009/08/05_14:21:31 info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0:0
> heartbeat[23629]: 2009/08/05_14:21:31 ERROR: glib: ucast: error setting option SO_BINDTODEVICE(w) on eth0:0: No such device
> heartbeat[23629]: 2009/08/05_14:21:31 ERROR: make_io_childpair: cannot open ucast eth0:0
> heartbeat[23629]: 2009/08/05_14:21:31 debug: Exiting from pid 23629 [rc=1]
> heartbeat[23632]: 2009/08/05_14:21:32 debug: pid 23632 locked in memory.
> heartbeat[23632]: 2009/08/05_14:21:32 debug: Limiting CPU: 6 CPU seconds every 6 milliseconds
> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown: Master Control process died.
> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Killing pid 23629 with SIGTERM
> heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
> heartbeat[23632]: 2009/08/05_14:21:33 debug: Process 23632 processing SIGTERM
> heartbeat[23632]: 2009/08/05_14:21:33 debug: Exiting from pid 23632 [rc=15]
>
> Any suggestions are welcome
>
> Kind Regards
>
> SST

--
Att,
Maiquel
[Linux-HA] alias interfaces and heartbeat 2 on centOS
Hi,

is there a problem using alias interfaces with heartbeat 2.1.3 (CentOS-RPM)? I configured in the ha.cf something like:

ucast eth0:0 192.168.95.13

and got:

debug: opening ucast eth0:0 (UDP/IP unicast)
heartbeat[23629]: 2009/08/05_14:21:31 info: glib: ucast: write socket priority set to IPTOS_LOWDELAY on eth0:0
heartbeat[23629]: 2009/08/05_14:21:31 ERROR: glib: ucast: error setting option SO_BINDTODEVICE(w) on eth0:0: No such device
heartbeat[23629]: 2009/08/05_14:21:31 ERROR: make_io_childpair: cannot open ucast eth0:0
heartbeat[23629]: 2009/08/05_14:21:31 debug: Exiting from pid 23629 [rc=1]
heartbeat[23632]: 2009/08/05_14:21:32 debug: pid 23632 locked in memory.
heartbeat[23632]: 2009/08/05_14:21:32 debug: Limiting CPU: 6 CPU seconds every 6 milliseconds
heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown: Master Control process died.
heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Killing pid 23629 with SIGTERM
heartbeat[23632]: 2009/08/05_14:21:33 CRIT: Emergency Shutdown(MCP dead): Killing ourselves.
heartbeat[23632]: 2009/08/05_14:21:33 debug: Process 23632 processing SIGTERM
heartbeat[23632]: 2009/08/05_14:21:33 debug: Exiting from pid 23632 [rc=15]

Any suggestions are welcome

Kind Regards

SST
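[Editor's note: the error comes from SO_BINDTODEVICE, which only accepts real kernel device names, so the alias eth0:0 cannot be bound to. A hedged workaround, assuming the peer's 192.168.95.13 is reachable over eth0, is to name the parent device instead:]

```sh
# /etc/ha.d/ha.cf -- bind to the real device; the destination can still be
# the peer address that lives on its eth0:0 alias
ucast eth0 192.168.95.13
```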
Re: [Linux-HA] probable bug with the OCF drbd script
Hi,

I know this is a very old thread.

But I'm now trying heartbeat 2.99.2 (from the provided rpm packages) and I still have these 'local's in the drbd script. Is this normal? Has the bug been corrected?

On Monday 17 November 2008 12:35:20 Dejan Muhamedagic wrote:
> Hi,
>
> On Fri, Nov 14, 2008 at 09:13:54AM +0100, Marc Cousin wrote:
> > Hi,
> >
> > I've been fighting with the OCF DRBD script returning me success when
> > trying to make a resource secondary, when it failed: the drbdadm
> > secondary command fails (returns 11) and the drbd script returns 0.
> > It's with heartbeat 2.1.4.
> >
> > I think I've located the culprit:
> >
> > do_drbdadm() {
> >     local cmd="$DRBDADM -c $DRBDCONF $*"
> >     ocf_log debug "$RESOURCE: Calling $cmd"
> >     local cmd_out=$($cmd 2>&1)
> >     ret=$?
> >     # Trim the garbage drbdadm likes to print when using the node
> >     # override feature:
> >     local cmd_ret=$(echo $cmd_out | sed -e 's/found __DRBD_NODE__.*//')
> >     if [ $ret != 0 ]; then
> >         ocf_log err "$RESOURCE: Called $cmd"
> >         ocf_log err "$RESOURCE: Exit code $ret"
> >         ocf_log err "$RESOURCE: Command output: $cmd_ret"
> >     else
> >         ocf_log debug "$RESOURCE: Exit code $ret"
> >         ocf_log debug "$RESOURCE: Command output: $cmd_ret"
> >     fi
> >     echo $cmd_ret
> >     return $ret
> > }
> >
> > local cmd_out=$($cmd 2>&1)
> > ret=$?
> >
> > In this case $? is always 0. As I don't know sh that much (and hate it
> > a lot :) ), I've been trying to find the reason, and as soon as I
> > remove the 'local', the return code is transmitted to $? again.
>
> True. Interesting that nobody found this before.
>
> > This is quite an important problem I think, because whenever heartbeat
> > fails to do a drbd command, it thinks it has worked (instead of
> > retrying).
>
> Fixed in the development repository.
>
> Cheers,
>
> Dejan
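[Editor's note: the quoted bug is generic shell behavior, not DRBD-specific. `local` is itself a command, so `local var=$(cmd)` leaves `$?` set to the exit status of `local` (almost always 0), masking the command's. A minimal self-contained demonstration:]

```shell
#!/usr/bin/env bash

buggy() {
    local out=$(false)   # 'false' exits 1, but...
    echo $?              # ...this prints 0: the status of the 'local' builtin
}

fixed() {
    local out            # declare first,
    out=$(false)         # ...assign separately
    echo $?              # prints 1: the real exit status of 'false'
}

buggy   # prints 0
fixed   # prints 1
```

This is why the fix is to split the declaration from the assignment, as done in later versions of the resource agent.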
Re: [Linux-HA] Pacemaker 1.4 & HBv2 1.99 // About quorum choice
On Wed, Aug 5, 2009 at 10:22 AM, Alain.Moulle wrote:
> Hello,
>
> I'm a little bit confused about quorum configuration:
> there is the have-quorum parameter, which is normally
> managed by the cluster itself.
> On my configuration, its value is "0".
> But the Pacemaker documentation says:
> "have-quorum : If false, this may mean that the cluster cannot start
> resources or fence other nodes."

0 == false

> So I guess it is quite mandatory to set have-quorum to "1", isn't it?

You can't set it, it's a property of the cluster. Any value you set will be overwritten by the actual quorum state the cluster has.

> So I tried:
> crm_attribute --attr-name have-quorum --attr-value true
> have-quorum is still "0" in the cib.xml, but there is now an nv-pair
> with value="true", so I guess it overrides it.
> But anyway, what is the best choice for have-quorum for a cluster of,
> let's say, between 2 and 8 nodes?

There is only one way to get quorum: have more than half of the nodes online. You can look at the no-quorum-policy option though, that affects what the cluster does when it doesn't have quorum.
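[Editor's note: put as arithmetic, "more than half of the nodes" means a partition needs at least floor(n/2)+1 members. This is just the rule in shell for illustration, not a Pacemaker API:]

```shell
#!/usr/bin/env bash
# Minimum partition size that has quorum in an n-node cluster.
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# Does a partition of 'online' nodes out of 'total' have quorum?
has_quorum() {
    [ "$1" -ge "$(quorum_threshold "$2")" ]
}

quorum_threshold 8   # prints 5: lose 4 of 8 nodes and quorum is gone
quorum_threshold 2   # prints 2: a lone node in a 2-node cluster never has quorum
```

The two-node result is why that case needs no-quorum-policy=ignore plus working stonith, as discussed elsewhere in this thread.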
Re: [Linux-HA] Command to see if a resource is started or not
On 08/05/2009 11:09 AM, Dominik Klein wrote:
> Tobias Appel wrote:
>> On 08/05/2009 10:30 AM, Dominik Klein wrote:
>>> Tobias Appel wrote:
>>>> So all I need is a command line tool to check whether a resource is
>>>> currently started or not. I tried to check the resources with the
>>>> failcount command, but it's always 0. And the crm_resource command
>>>> is used to configure a resource but does not seem to give me the
>>>> status of a resource.
>>>>
>>>> I know I can use crm_mon but I would rather have a small command
>>>> since I could include this in our monitoring tool (nagios).
>>>
>>> crm resource status
>>>
>>> Regards
>>> Dominik
>>
>> Thanks for the fast reply Dominik,
>>
>> I forgot to mention that I still run Heartbeat version 2.1.4.
>> It seems crm_resource does not respond to the status flag. Or am I
>> too stupid?
>
> It is not crm_resource, I meant crm resource (notice the blank).
>
> But the crm command is not in 2.1.4.
>
> Try crm_resource -W -r <resource>
>
> Regards
> Dominik

Thanks a lot - this is exactly what I needed!
Re: [Linux-HA] Command to see if a resource is started or not
Tobias Appel wrote:
> On 08/05/2009 10:30 AM, Dominik Klein wrote:
>> Tobias Appel wrote:
>>> So all I need is a command line tool to check whether a resource is
>>> currently started or not. I tried to check the resources with the
>>> failcount command, but it's always 0. And the crm_resource command is
>>> used to configure a resource but does not seem to give me the status
>>> of a resource.
>>>
>>> I know I can use crm_mon but I would rather have a small command since
>>> I could include this in our monitoring tool (nagios).
>>
>> crm resource status
>>
>> Regards
>> Dominik
>
> Thanks for the fast reply Dominik,
>
> I forgot to mention that I still run Heartbeat version 2.1.4.
> It seems crm_resource does not respond to the status flag. Or am I too
> stupid?

It is not crm_resource, I meant crm resource (notice the blank).

But the crm command is not in 2.1.4.

Try crm_resource -W -r <resource>

Regards
Dominik
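[Editor's sketch of wrapping crm_resource -W for Nagios. The exact wording crm_resource prints ("resource ... is running on: ...") is an assumption from heartbeat 2.1.x-era output; verify against your version and adjust the pattern. The parsing is a separate function so it can be tested without a live cluster:]

```shell
#!/usr/bin/env bash
# Hypothetical Nagios plugin around crm_resource -W.

# Decide from a line of crm_resource output whether the resource runs.
is_running_line() {
    case "$1" in
        *"is running on"*) return 0 ;;
        *)                 return 1 ;;
    esac
}

check_resource() {   # usage: check_resource <resource-id>
    local out
    out=$(crm_resource -W -r "$1" 2>&1)   # assign after 'local' to keep $? intact
    if is_running_line "$out"; then
        echo "OK - $out";       return 0   # Nagios: OK
    else
        echo "CRITICAL - $out"; return 2   # Nagios: CRITICAL
    fi
}
```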
Re: [Linux-HA] Command to see if a resource is started or not
On 08/05/2009 10:30 AM, Dominik Klein wrote:
> Tobias Appel wrote:
>> So all I need is a command line tool to check whether a resource is
>> currently started or not. I tried to check the resources with the
>> failcount command, but it's always 0. And the crm_resource command is
>> used to configure a resource but does not seem to give me the status
>> of a resource.
>>
>> I know I can use crm_mon but I would rather have a small command since
>> I could include this in our monitoring tool (nagios).
>
> crm resource status
>
> Regards
> Dominik

Thanks for the fast reply Dominik,

I forgot to mention that I still run Heartbeat version 2.1.4.
It seems crm_resource does not respond to the status flag. Or am I too stupid?

Bye,
Tobi
Re: [Linux-HA] Command to see if a resource is started or not
Tobias Appel wrote:
> Hi,
>
> I need a command to see if a resource is started or not. Somehow my
> IPMI resource does not always start, especially on one node (for
> example if I reboot the node, or have a failover). There is no error;
> it just does nothing at all.
> Usually I have to clean up the resource and then it comes back by
> itself. This is not really a problem, since it only occurs after a
> failover or reboot, and when that happens somebody usually takes a look
> at the cluster anyway. But some people forget to start it again, and
> during maintenance we have to turn it off on purpose, since it would
> wreak havoc and turn off one of the nodes.
>
> So all I need is a command line tool to check whether a resource is
> currently started or not. I tried to check the resources with the
> failcount command, but it's always 0. And the crm_resource command is
> used to configure a resource, but it does not seem to give me the
> status of a resource.
>
> I know I can use crm_mon, but I would rather have a small command,
> since I could include it in our monitoring tool (Nagios).

crm resource status

Regards
Dominik
[Linux-HA] Command to see if a resource is started or not
Hi,

I need a command to see if a resource is started or not. Somehow my IPMI
resource does not always start, especially on one node (for example if I
reboot the node, or have a failover). There is no error; it just does
nothing at all.
Usually I have to clean up the resource and then it comes back by itself.
This is not really a problem, since it only occurs after a failover or
reboot, and when that happens somebody usually takes a look at the
cluster anyway. But some people forget to start it again, and during
maintenance we have to turn it off on purpose, since it would wreak havoc
and turn off one of the nodes.

So all I need is a command line tool to check whether a resource is
currently started or not. I tried to check the resources with the
failcount command, but it's always 0. And the crm_resource command is
used to configure a resource, but it does not seem to give me the status
of a resource.

I know I can use crm_mon, but I would rather have a small command, since
I could include it in our monitoring tool (Nagios).

Thanks in advance,
Tobi
[Linux-HA] Pacemaker 1.4 & HBv2 1.99 // About quorum choice
Hello,

I'm a little bit confused about the quorum configuration: there is the
have-quorum parameter, which is normally managed by the cluster itself.
On my configuration, its value is "0".

But the Pacemaker documentation says:

  "have-quorum: If false, this may mean that the cluster cannot start
  resources or fence other nodes."

So I guess it is quite mandatory to set have-quorum to "1", isn't it?

So I tried:

  crm_attribute --attr-name have-quorum --attr-value true

The have-quorum is always "0" in the cib.xml, so I guess it overloads the
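A hedged note, not from the thread itself: in Pacemaker, have-quorum is a status value the cluster computes from its membership, so a hand-written value is simply overwritten at the next recalculation. The tunable knob is the cluster option that decides what to do *without* quorum. The option name and flags below are taken from the Pacemaker documentation of that era; verify them against your installed version.

```shell
# Tell the cluster to keep running resources even without quorum
# (a common choice for two-node clusters, where losing one node
# always loses quorum). Option name and flags assumed from the
# Pacemaker documentation:
crm_attribute --type crm_config --attr-name no-quorum-policy --attr-value ignore

# Read the value back:
crm_attribute --type crm_config --attr-name no-quorum-policy --get-value
```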
[Linux-HA] ANNOUNCE: New Linux-HA repository structure
Lars has asked me to announce that, at long last, we have finalized the
new Linux-HA repository/project structure.

Effective immediately, Heartbeat 2.x has been split into the following
projects:

* cluster-glue 1.0
* resource-agents 1.0
* heartbeat 3.0-beta

### Cluster Glue 1.0

- http://hg.linux-ha.org/glue/
- http://hg.linux-ha.org/glue/archive/glue-1.0.tar.gz

A collection of common tools that are useful for writing cluster stacks
such as Heartbeat and cluster managers such as Pacemaker. Provides a
local resource manager that understands the OCF and LSB standards, and an
interface to common STONITH devices.

### Resource Agents 1.0

- http://hg.linux-ha.org/agents/
- http://hg.linux-ha.org/agents/archive/agents-1.0.tar.gz

OCF-compliant scripts that allow common services to operate in a
high-availability environment.

### Heartbeat 3.0-beta

- http://hg.linux-ha.org/dev/

A cluster stack providing messaging and membership services that can be
used by resource managers such as Pacemaker. Heartbeat still contains the
simple two-node resource manager (aka haresources) from before version 2.
The board will release 3.0-final at a time of its choosing.

These changes have been put in place to allow the group to release
updates at intervals that suit each individual project. This also makes
better use of our limited QA resources, as we are no longer forced to
test the entire stack in order to release an updated set of resource
agents. Additionally, the changes aim to increase the usage of the
individual components by allowing them to be used independently.

Preliminary packages for the most recent openSUSE, SLES, Fedora and RHEL
releases are currently available at

  http://download.opensuse.org/repositories/server:/ha-clustering:/NG

Older distros can be added if there is sufficient demand. The existing
repositories will be migrated to the new package layout over the coming
days and weeks.
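The repository layout above follows one pattern, which can be illustrated with a tiny helper (a hedged sketch, not part of the announcement — it only restates the URLs listed for the glue and agents tarballs; the heartbeat 3.0-beta repository has no release tarball yet):

```shell
#!/bin/sh
# tarball_url: compose the release-tarball URL from the repository
# layout given in the announcement: each project lives at
# http://hg.linux-ha.org/<repo>/ with tarballs under archive/.
tarball_url() {
    # $1 = repository name (glue, agents), $2 = version
    echo "http://hg.linux-ha.org/$1/archive/$1-$2.tar.gz"
}

tarball_url glue 1.0    # http://hg.linux-ha.org/glue/archive/glue-1.0.tar.gz
tarball_url agents 1.0  # http://hg.linux-ha.org/agents/archive/agents-1.0.tar.gz
```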
--
Andrew