[Pacemaker] Node doesn't rejoin automatically after reboot

2010-09-03 Thread Tom Tux
Hi, if I remove one cluster node (node01) from the cluster for maintenance (/etc/init.d/openais stop) and reboot it, it does not rejoin the cluster automatically. After the reboot, I have the following error and warning messages in the log: Sep 3 07:34:09 node01 mgmtd: [9201]:

Re: [Pacemaker] pingd

2010-09-03 Thread Lars Ellenberg
On Thu, Sep 02, 2010 at 09:33:59PM +0200, Andrew Beekhof wrote: On Thu, Sep 2, 2010 at 4:05 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Thu, Sep 02, 2010 at 11:00:12AM +0200, Bernd Schubert wrote: On Thursday, September 02, 2010, Andrew Beekhof wrote: On Wed, Sep 1, 2010 at

Re: [Pacemaker] pingd

2010-09-03 Thread Bernd Schubert
On Friday, September 03, 2010, Lars Ellenberg wrote: how about an fping RA ? active=$(fping -a -i 5 -t 250 -B1 -r1 $host_list 2>/dev/null | wc -l) terminates in about 3 seconds for a hostlist of 100 (on the LAN, 29 of which are alive). Happy to add if someone writes it :-) I

Re: [Pacemaker] pingd

2010-09-03 Thread Lars Ellenberg
On Fri, Sep 03, 2010 at 12:12:58PM +0200, Bernd Schubert wrote: PS: (*) As you insist ;) on quorum with n/2 + 1 nodes, we use ping as replacement. We simply cannot fulfill n/2 + 1, as controller failure takes down 50% of the systems (virtual machines) and the systems (VMs) of the

Re: [Pacemaker] pingd

2010-09-03 Thread Andrew Beekhof
On Fri, Sep 3, 2010 at 9:38 AM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Thu, Sep 02, 2010 at 09:33:59PM +0200, Andrew Beekhof wrote: On Thu, Sep 2, 2010 at 4:05 PM, Lars Ellenberg lars.ellenb...@linbit.com wrote: On Thu, Sep 02, 2010 at 11:00:12AM +0200, Bernd Schubert wrote: On

Re: [Pacemaker] pingd

2010-09-03 Thread Andrew Beekhof
On Fri, Sep 3, 2010 at 12:12 PM, Bernd Schubert bs_li...@aakef.fastmail.fm wrote: On Friday, September 03, 2010, Lars Ellenberg wrote: how about an fping RA ? active=$(fping -a -i 5 -t 250 -B1 -r1 $host_list 2>/dev/null | wc -l) terminates in about 3 seconds for a hostlist of 100 (on

Re: [Pacemaker] pingd

2010-09-03 Thread Lars Ellenberg
On Fri, Sep 03, 2010 at 12:12:58PM +0200, Bernd Schubert wrote: On Friday, September 03, 2010, Lars Ellenberg wrote: how about an fping RA ? active=$(fping -a -i 5 -t 250 -B1 -r1 $host_list 2>/dev/null | wc -l) terminates in about 3 seconds for a hostlist of 100 (on the LAN, 29 of

Re: [Pacemaker] Setting up routing for a virtual ip

2010-09-03 Thread Stephan-Frank Henry
Original Message Date: Thu, 02 Sep 2010 19:08:00 +0200 From: Stephan-Frank Henry Frank dot Henry at gmx dot net To: The Pacemaker cluster resource manager pacemaker@oss.clusterlabs.org Subject: Re: [Pacemaker] Setting up routing for a virtual ip

Re: [Pacemaker] Node doesn't rejoin automatically after reboot

2010-09-03 Thread Michael Smith
Tom Tux wrote: If I remove one cluster node (node01) from the cluster for maintenance (/etc/init.d/openais stop) and reboot it, it does not rejoin the cluster automatically. After the reboot, I have the following error and warning messages in the log: Sep 3 07:34:15 node01 mgmtd:

[Pacemaker] MCP init script to 21/79?

2010-09-03 Thread Steven Dake
On 08/24/2010 11:06 PM, Andrew Beekhof wrote: On Wed, Aug 25, 2010 at 8:02 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: 25.08.2010 08:56, Andrew Beekhof wrote: On Wed, Aug 25, 2010 at 7:39 AM, Vladislav Bogdanov bub...@hoster-ok.com wrote: Hi all, pacemaker has # chkconfig - 90 90 in

Re: [Pacemaker] MCP init script to 21/79?

2010-09-03 Thread Vladislav Bogdanov
03.09.2010 19:34, Steven Dake wrote: Nope, they are in a natural order for both start and stop sequences. So lower number means 'do start or stop earlier'. grep '# chkconfig' /etc/init.d/* Ok, thanks. Changed to 10 Given that corosync default is 20/80, shouldn't mcp be 21/79? I think

Re: [Pacemaker] MCP init script to 21/79?

2010-09-03 Thread Steven Dake
On 09/03/2010 09:56 AM, Vladislav Bogdanov wrote: 03.09.2010 19:34, Steven Dake wrote: Nope, they are in a natural order for both start and stop sequences. So lower number means 'do start or stop earlier'. grep '# chkconfig' /etc/init.d/* Ok, thanks. Changed to 10 Given that corosync

[Pacemaker] Couldn't find device [/dev/drbd/by-res/wwwdata]. Expected /dev/??? to exist

2010-09-03 Thread Alisson Landim
I was following the Clusters from Scratch guide and everything was fine until I got here: http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch08s05.html The command: grep ERROR: /var/log/messages | grep -v unpack_resources Says: Couldn't find device

Re: [Pacemaker] Couldn't find device [/dev/drbd/by-res/wwwdata]. Expected /dev/??? to exist

2010-09-03 Thread Adam Gandelman
On 09/03/2010 01:05 PM, Alisson Landim wrote: I was following the Clusters from Scratch guide and everything was fine until I got here: http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch08s05.html The command: grep ERROR: /var/log/messages | grep -v

Re: [Pacemaker] Couldn't find device [/dev/drbd/by-res/wwwdata]. Expected /dev/??? to exist

2010-09-03 Thread Alisson Landim
Adam Gandelman wrote: Don't know your DRBD version, yum install drbd-pacemaker returns: Package drbd-pacemaker-8.3.7-2.fc13.x86_64 already installed and latest version also drbdadm says: