Re: [Pacemaker] pingd

2010-09-03 Thread Lars Ellenberg
On Thu, Sep 02, 2010 at 09:33:59PM +0200, Andrew Beekhof wrote:
> On Thu, Sep 2, 2010 at 4:05 PM, Lars Ellenberg wrote:
> > On Thu, Sep 02, 2010 at 11:00:12AM +0200, Bernd Schubert wrote:
> >> On Thursday, September 02, 2010, Andrew Beekhof wrote:
> >> > On Wed, Sep 1, 2010 at 11:59 AM, Bernd Schubert wrote:
> >> > > My proposal is to rip all the network code out of pingd and add
> >> > > slightly modified files from 'iputils'.
> >> >
> >> > Close, but that's not portable.
> >> > Instead use ocf:pacemaker:ping which goes a step further and ditches
> >> > the daemon piece altogether.
> >>
> >> Hmm, we are already using that temporarily for now. But I don't think the
> >> ping RA is suitable for larger clusters. The ping RA runs everything
> >> serially, and only in intervals when called by lrmd. Now let's assume we
> >> have a 20 node cluster.
> >>
> >> nodes = 20
> >> timeout = 2
> >> attempts = 2
> >>
> >> That makes 20 x 2 x 2 = 80 s for a single run, even with these already
> >> rather small timeouts, which is IMHO a bit long. And with a shell script
> >> I don't see a way to improve that. While we could send the pings in
> >> parallel, I have no idea how to lock the variable counting active nodes
> >> (active=`expr $active + 1`). I don't think that plain sh or even bash has
> >> a semaphore or mutex lock. So IMHO, we need a language that supports
> >> that; rewriting the pingd RA is one choice, rewriting the ping RA in
> >> Python is another.
> >
> > how about an fping RA ?
> > active=$(fping -a -i 5 -t 250 -B1 -r1 $host_list 2>/dev/null | wc -l)
> >
> > terminates in about 3 seconds for a hostlist of 100 (on the LAN, 29 of
> > which are alive).
> 
> Happy to add if someone writes it :-)

I thought so ;-)
Additional note to whoever is going to:

With fping you can get fancy about "better connectivity";
you are not limited to the measure "number of nodes responding".
You could also use the statistics on packet loss and rtt provided on
stderr in -c or -C mode (example output below; choose what you think is
easier to parse), then apply some scoring scheme to average or max packet
loss, rtt, or whatever else makes sense to you.
(If a switch starts dying, it may produce increasing packet loss first...)

Or start a smokeping daemon,
and use the triggers there to change pacemaker attributes.
Uhm, well, that's probably no longer maintainable, though ;-)

# fping -q -i 5 -t 250 -B1 -r2 -C5 -g 10.9.9.50 10.9.9.70
10.9.9.50 : 0.14 0.14 0.16 0.12 0.15
10.9.9.51 : - - - - -
10.9.9.52 : - - - - -
10.9.9.53 : 0.37 0.34 0.36 0.34 0.34
10.9.9.54 : 0.13 0.12 0.13 0.12 0.13
10.9.9.55 : 0.17 0.15 0.16 0.12 0.22
10.9.9.56 : 0.32 0.32 0.31 0.41 0.36
10.9.9.57 : 0.35 0.33 0.32 0.34 0.32
10.9.9.58 : - - - - -
10.9.9.59 : - - - - -
10.9.9.60 : - - - - -
10.9.9.61 : - - - - -
10.9.9.62 : - - - - -
10.9.9.63 : - - - - -
10.9.9.64 : - - - - -
10.9.9.65 : 1.92 0.33 0.33 0.33 0.34
10.9.9.66 : - - - - -
10.9.9.67 : - - - - -
10.9.9.68 : - - - - -
10.9.9.69 : 0.15 0.14 0.17 0.13 0.14
10.9.9.70 : - - - - -

# fping -q -i 5 -t 250 -B1 -r2 -c5 -g 10.9.9.50 10.9.9.70
10.9.9.50 : xmt/rcv/%loss = 5/5/0%, min/avg/max = 0.11/0.13/0.15
10.9.9.51 : xmt/rcv/%loss = 5/0/100%
10.9.9.52 : xmt/rcv/%loss = 5/0/100%
10.9.9.53 : xmt/rcv/%loss = 5/5/0%, min/avg/max = 0.33/0.34/0.37
10.9.9.54 : xmt/rcv/%loss = 5/5/0%, min/avg/max = 0.10/0.11/0.13
10.9.9.55 : xmt/rcv/%loss = 5/5/0%, min/avg/max = 0.13/0.16/0.20
10.9.9.56 : xmt/rcv/%loss = 5/5/0%, min/avg/max = 0.34/0.36/0.41
10.9.9.57 : xmt/rcv/%loss = 5/5/0%, min/avg/max = 0.16/0.25/0.33
10.9.9.58 : xmt/rcv/%loss = 5/0/100%
10.9.9.59 : xmt/rcv/%loss = 5/0/100%
10.9.9.60 : xmt/rcv/%loss = 5/0/100%
10.9.9.61 : xmt/rcv/%loss = 5/0/100%
10.9.9.62 : xmt/rcv/%loss = 5/0/100%
10.9.9.63 : xmt/rcv/%loss = 5/0/100%
10.9.9.64 : xmt/rcv/%loss = 5/0/100%
10.9.9.65 : xmt/rcv/%loss = 5/5/0%, min/avg/max = 0.28/0.32/0.34
10.9.9.66 : xmt/rcv/%loss = 5/0/100%
10.9.9.67 : xmt/rcv/%loss = 5/0/100%
10.9.9.68 : xmt/rcv/%loss = 5/0/100%
10.9.9.69 : xmt/rcv/%loss = 5/5/0%, min/avg/max = 0.13/0.14/0.15
10.9.9.70 : xmt/rcv/%loss = 5/0/100%
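
To make the scoring idea concrete, here is a rough sketch that turns the
-c statistics above into an average "reachability" score across the ping
hosts. The scoring scheme itself is just one possibility; attrd_updater's
-n/-v/-d options are the usual way to push the result:

# Sketch only: average reachability (0..100) over all hosts in $host_list.
score=$(fping -q -i 5 -t 250 -B1 -r2 -c5 $host_list 2>&1 |
    awk -F'[=/%]+' '/loss/ { sum += 100 - $6; n++ }  # $6 is the %loss column
                    END { print (n ? int(sum / n) : 0) }')
# attrd_updater -n pingd -v "$score" -d 30s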

> >> So in fact my first proposal was also only the first step - first add
> >> better network code, and then make it multi-threaded - each ping host
> >> gets its own thread.
> >
> > A working pingd daemon has the additional advantage that it can ask its
> > peers for their ping node count, before actually updating the attribute,
> > which should help with the "dampen race".
> 
> That happens at the attrd level in both cases.  pingd adds nothing here.

I thought pingd did the dampening itself, even communicated with its peer
pingds, and there was no more dampening in attrd involved after that.
But if you say so. I never looked at pingd too closely.
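
(For reference, the attrd-level dampening in question is the per-attribute
delay passed along with updates - a hypothetical call:)

attrd_updater -n pingd -v 3000 -d 30s  # attrd holds changes for 30s before writing them out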

> >> PS: (*) As you insist ;) on quorum with n/2 + 1 nodes, we use ping as
> >> replacement. We simply cannot fulfill n/2 + 1, as controller failure takes
> >> down 50% of the systems (virtual machines), and the systems (VMs) of the
> >> 2nd controller are then supposed to take over the failed services. I see
> >> that n/2 + 1 is optimal and also required for a few nodes. But if you have
> >> a larger set of systems (e.g. minimum 6 with the VM systems I have in
> >> mind), n/2 + 1 is sufficient, IMHO.
Re: [Pacemaker] pingd

2010-09-03 Thread Bernd Schubert
On Friday, September 03, 2010, Lars Ellenberg wrote:
> > > how about an fping RA ?
> > > active=$(fping -a -i 5 -t 250 -B1 -r1 $host_list 2>/dev/null | wc -l)
> > > 
> > > terminates in about 3 seconds for a hostlist of 100 (on the LAN, 29 of
> > > which are alive).
> > 
> > Happy to add if someone writes it :-)
> 
> I thought so ;-)
> Additional note to whoever is going to:
> 
> With fping you can get fancy about "better connectivity",
> you are not limited to the measure "number of nodes responding".

I think for the beginning, just the basic feature should be sufficient.
Actually I thought about adding an option to the existing ping RA to let the
user choose between ping and fping; it would default to ping. I will do that
in the middle of next week.
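
A minimal sketch of what that option could look like in the RA's monitor
path (the ping_binary parameter name is invented here for illustration;
host_list, attempts and timeout are the existing ping RA parameters):

case "${OCF_RESKEY_ping_binary:-ping}" in
fping)
    # fping probes the whole list in a single pass
    active=$(fping -a -t 250 -r1 $OCF_RESKEY_host_list 2>/dev/null | wc -l)
    ;;
*)
    # classic serial loop, roughly as the RA does today
    active=0
    for host in $OCF_RESKEY_host_list; do
        ping -n -q -c $OCF_RESKEY_attempts -w $OCF_RESKEY_timeout $host \
            >/dev/null 2>&1 && active=`expr $active + 1`
    done
    ;;
esac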


> You could also use the statistics on packet loss and rtt provided on
> stderr in -c or -C mode (example output below; choose what you think is
> easier to parse), then apply some scoring scheme to average or max packet
> loss, rtt, or whatever else makes sense to you.
> (If a switch starts dying, it may produce increasing packet loss first...)

That will require quite some parsing, which I'm not comfortable with in a
shell script. I have no objections to later adding an fping RA written in
Perl or Python.

[...]

> 
> > >> PS: (*) As you insist ;) on quorum with n/2 + 1 nodes, we use ping as
> > >> replacement. We simply cannot fulfill n/2 + 1, as controller failure
> > >> takes down 50% of the systems (virtual machines), and the systems
> > >> (VMs) of the 2nd controller are then supposed to take over the failed
> > >> services. I see that n/2 + 1 is optimal and also required for a few
> > >> nodes. But if you have a larger set of systems (e.g. minimum 6 with
> > >> the VM systems I have in mind), n/2 + 1 is sufficient, IMHO.
> > > 
> > > You meant to say you consider == n/2 sufficient, instead of > n/2 ?
> 
> So you have a two-node virtualization setup, each host running n/2 VMs,
> and you do the Pacemaker clustering between those VMs?

Yes.

> 
> I'm sure you could easily add, "somewhere else", a very bare-bones VM
> (or real) server that is a dedicated member of your cluster but
> never takes any resources? Just serves as arbitrator, as your "+1"?

No, I'm afraid it is not that easy. There simply is nothing that can be
used. If there is anything, it is always available on both hosts/controllers.
Imagine you would sell a standalone DRBD system (black box) that provides,
for example, NFS to clients. You would want every additional service
mirrored as well, and you could not rely on additional customer NFS
clients.

> 
> May be easier, safer, and more transparent than
> no-quorum-policy=ignore plus some ping-attribute-based auto-shutdown.

I agree on safer and more transparent, but unfortunately it is not easier
in our case.

-- 
Bernd Schubert
DataDirect Networks



Re: [Pacemaker] pingd

2010-09-03 Thread Lars Ellenberg
On Fri, Sep 03, 2010 at 12:12:58PM +0200, Bernd Schubert wrote:
> [...]
>
> > I'm sure you could easily add, "somewhere else", a very bare-bones VM
> > (or real) server that is a dedicated member of your cluster but
> > never takes any resources? Just serves as arbitrator, as your "+1"?
>
> No, I'm afraid it is not that easy. There simply is nothing that can be
> used. If there is anything, it is always available on both hosts/controllers.
> Imagine you would sell a standalone DRBD system (black box) that provides,
> for example, NFS to clients. You would want every additional service
> mirrored as well, and you could not rely on additional customer NFS
> clients.
> 
> > 
> > May be easier, safer, and more transparent than
> > no-quorum-policy=ignore plus some ping-attribute-based auto-shutdown.
> 
> I agree on safer and more transparent, but unfortunately it is not easier
> in our case.

Sell a Sheevaplug together with the two black boxes,
and tell them to plug it in somewhere.
Yeah, well, maybe some other micro thingy with two ethernet adapters,
so you can plug it into your fully redundant network topology ;-)

Something like that.

-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.



Re: [Pacemaker] pingd

2010-09-03 Thread Andrew Beekhof
On Fri, Sep 3, 2010 at 9:38 AM, Lars Ellenberg wrote:
> On Thu, Sep 02, 2010 at 09:33:59PM +0200, Andrew Beekhof wrote:
> [...]
>> Happy to add if someone writes it :-)
>
> I thought so ;-)
> Additional note to whoever is going to:
>
> With fping you can get fancy about "better connectivity",
> you are not limited to the measure "number of nodes responding".
> You could also use the statistics on packet loss and rtt provided on
> stderr in -c or -C mode (example output below; choose what you think is
> easier to parse), then apply some scoring scheme to average or max packet
> loss, rtt, or whatever else makes sense to you.
> (If a switch starts dying, it may produce increasing packet loss first...)

This sounds great.
I think we want the ping RA to use fping where available.

>
> [example fping output snipped]
>
>
>> >> So in fact my first proposal was also only the first step - first add
>> >> better network code, and then make it multi-threaded - each ping host
>> >> gets its own thread.
>> >
>> > A working pingd daemon has the additional advantage that it can ask its
>> > peers for their ping node count, before actually updating the attribute,
>> > which should help with the "dampen race".
>>
>> That happens at the attrd level in both cases.  pingd adds nothing here.
>
> I thought pingd did the dampening itself, even communicated with its peer
> pingds, and there was no more dampening in attrd involved after that.

Re: [Pacemaker] pingd

2010-09-03 Thread Andrew Beekhof
On Fri, Sep 3, 2010 at 12:12 PM, Bernd Schubert wrote:
> On Friday, September 03, 2010, Lars Ellenberg wrote:
>> [...]
>>
>> With fping you can get fancy about "better connectivity",
>> you are not limited to the measure "number of nodes responding".
>
> I think for the beginning, just the basic feature should be sufficient.
> Actually I thought about adding an option to the existing ping RA to let the
> user choose between ping and fping; it would default to ping. I will do that
> in the middle of next week.

Could you make fping the default binary if it's installed and send us
the patch when you're done? :-)
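
(Perhaps via something like this in the RA's defaults - a sketch only;
have_binary is the helper from ocf-shellfuncs:)

if have_binary fping; then
    : ${OCF_RESKEY_ping_binary:=fping}
else
    : ${OCF_RESKEY_ping_binary:=ping}
fi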

> [...]



Re: [Pacemaker] pingd

2010-09-03 Thread Lars Ellenberg
On Fri, Sep 03, 2010 at 12:12:58PM +0200, Bernd Schubert wrote:
> On Friday, September 03, 2010, Lars Ellenberg wrote:
> > [...]
> 
> 
> > You could also use the statistics on packet loss and rtt provided on
> > stderr in -c or -C mode (example output below; choose what you think is
> > easier to parse), then apply some scoring scheme to average or max packet
> > loss, rtt, or whatever else makes sense to you.
> > (If a switch starts dying, it may produce increasing packet loss first...)
> 
> That will require quite some parsing, which I'm not comfortable with in a
> shell script. I have no objections to later adding an fping RA written in
> Perl or Python.

-s causes a summary to be displayed, like so:
fping -s -q -c 2 $host_list
  21 targets
   6 alive
  15 unreachable
   0 unknown addresses

   0 timeouts (waiting for response)
  42 ICMP Echos sent
  12 ICMP Echo Replies received
   0 other ICMP received

 0.11 ms (min round trip time)
 0.47 ms (avg round trip time)
 2.22 ms (max round trip time)
2.219 sec (elapsed real time)

That is easy enough to parse even from a shell script.

You just choose whether you want "number alive" or
"number of replies received", the latter accounting for packet loss.

If you want, you can resort to awk ;-)
N_REPLIES=$(fping -s -q -c $N_PINGS $host_list 2>&1 |
            awk '/ICMP Echo Replies received/ { print $1 }')
# update the attribute with $N_REPLIES, instead of number alive,
# maybe scale with N_PINGS and/or N_HOSTS ...
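
For instance (a sketch only; name, dampen and multiplier mirror the
existing ping RA's parameters, and this particular scaling is just one
option):

# N_REPLIES / N_PINGS is roughly the number of fully reachable hosts;
# multiply first so integer division doesn't throw away the fraction
SCORE=$(( N_REPLIES * ${OCF_RESKEY_multiplier:-1} / N_PINGS ))
attrd_updater -n "${OCF_RESKEY_name:-pingd}" -v "$SCORE" -d "${OCF_RESKEY_dampen:-5s}"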


-- 
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com



Re: [Pacemaker] Setting up routing for a virtual ip

2010-09-03 Thread Stephan-Frank Henry

 Original Message 
> Date: Thu, 02 Sep 2010 19:08:00 +0200
> From: "Stephan-Frank Henry"
> To: The Pacemaker cluster resource manager
> Subject: Re: [Pacemaker] Setting up routing for a virtual ip

>  Original Message 
> > Date: Thu, 02 Sep 2010 11:40:13 +0200
> > From: "Stephan-Frank Henry"
> > To: pacemaker@oss.clusterlabs.org
> > Subject: [Pacemaker] Setting up routing for a virtual ip
> 
> > Hello everyone,
> > 
> > I am currently stuck trying to set up routing for a configured virtual
> > ip to the static ip on the same host.
> > 
> > static ip: 150.158.1.2
> > (I have two nics in use, but this is the important one)
> > virtual ip: 1.2.3.4
> > nic: eth0
> > 
> > versions:
> > Debian Lenny 2.6.33.3 x86_64
> > corosync : 1.2.1-1
> > libheartbeat2 : 3.0.3-2
> > 
> > Here are the relevant parts:
> > [The CIB XML was mangled by the list archive. The recoverable fragments
> > show two ocf:heartbeat primitives: an IP resource with an instance
> > attribute value="22" (presumably the netmask) and an ip_gateway route
> > resource with instance attributes value="150.158.1.2" and
> > value="0.0.0.0/0", plus monitor operations.]
> > 
> > 
> > But when I run it, it prints out messages like:
> > Route[25503]: ERROR: ip_gateway Failed to add network route: to 0.0.0.0/0 via 150.158.1.2 src 1.2.3.5
> > WARN: unpack_rsc_op: Processing failed op ip_gateway_start_0 on nodealpha: unknown error (1)
> > 
> > I have tried it with variations (for instance, leaving out the device)
> > but without success.
> > 
> > If I remove the routing config, it works fine.
> > 
> > What am I missing?
> > 
> > Could it be related to the fact that I do not see a virtual interface
> > via ifconfig (-a)?
> > 
> > thanks
> > 
> > Frank
> 
> Self-update :D
> 
> I have updated the settings and am now only using IPaddr instead of the *2
> version, and now at least I can see the virtual ip.
>
> I also commented out the device (dunno if I should put in eth0 or eth0:0)
> and changed the virtual ip to 150.158.1.5
> 
> now I am getting this error
> 
> crmd: [20637]: info: do_lrm_rsc_op: Performing key=35:1:0:d010917f-1f67-415a-b02b-97c784c1974f op=ip_gateway_start_0 )
> lrmd: [20634]: info: rsc:ip_gateway:15: start
> crmd: [20637]: info: te_rsc_command: Initiating action 35: start ip_gateway_start_0 on nodealpha (local)
> crmd: [20637]: info: process_lrm_event: LRM operation ip_resource_monitor_1 (call=14, rc=0, cib-update=41, confirmed=false) ok
> lrmd: [20634]: info: RA output: (ip_gateway:start:stderr) RTNETLINK answers: File exists
> crmd: [20637]: info: match_graph_event: Action ip_resource_monitor_1 (34) confirmed on nodealpha (rc=0)
> Route[21137]: ERROR: ip_gateway Failed to add network route: to 0.0.0.0/0 via 150.158.1.2
> crmd: [20637]: info: process_lrm_event: LRM operation ip_gateway_start_0 (call=15, rc=1, cib-update=42, confirmed=true) unknown error
> 
> Anyone?
> 
> BTW: I'm not a network expert, so please highlight any issues.
> 
> thanks
> 
> Frank

Oookay, so it looks like one of the issues is my near-complete lack of
networking knowledge.

From one of our resident networking guys, it seems as though I need iptables
to solve this issue.

Something like:
iptables -t nat -A PREROUTING -p tcp -i eth0 -d 150.158.1.5 -j DNAT --to 150.158.1.2
iptables -A FORWARD -p tcp -i eth0 -d 150.158.1.5 -j ACCEPT

Is there any way to natively embed this into the resource management or do I 
need to have my own scripts?
I checked the files but I could not really find anything useful.

thanks

frank


Re: [Pacemaker] Node doesn't rejoin automatically after reboot

2010-09-03 Thread Michael Smith

Tom Tux wrote:


If I take one cluster node (node01) out of the cluster for maintenance
purposes (/etc/init.d/openais stop) and reboot this node, it will not
rejoin the cluster automatically. After the reboot, I have the following
error and warn messages in the log:

Sep  3 07:34:15 node01 mgmtd: [9202]: info: login to cib failed: live


Do you have messages like this, too?

Aug 30 15:48:10 xen-test1 corosync[5851]:  [IPC   ] Invalid IPC credentials.
Aug 30 15:48:10 xen-test1 cib: [5858]: info: init_ais_connection: Connection to our AIS plugin (9) failed: unknown (100)

Aug 30 15:48:10 xen-test1 cib: [5858]: CRIT: cib_init: Cannot sign in to the cluster... terminating


http://news.gmane.org/find-root.php?message_id=%3c4C7C0EC7.2050708%40cbnco.com%3e

Mike



[Pacemaker] MCP init script to 21/79?

2010-09-03 Thread Steven Dake

On 08/24/2010 11:06 PM, Andrew Beekhof wrote:

On Wed, Aug 25, 2010 at 8:02 AM, Vladislav Bogdanov wrote:

25.08.2010 08:56, Andrew Beekhof wrote:

On Wed, Aug 25, 2010 at 7:39 AM, Vladislav Bogdanov wrote:

Hi all,

pacemaker has
# chkconfig - 90 90
in its MCP initscript.

Shouldn't it be corrected to 90 10?


I thought higher numbers started later and shut down earlier... no?


Nope, they are in a natural order for both start and stop sequences.
So lower number means 'do start or stop earlier'.

grep '# chkconfig' /etc/init.d/*
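
For reference, the corrected header line would then read something like
this (illustrative values; colon as per the chkconfig(8) header convention;
90 = start late in the boot sequence, 10 = stop early at shutdown):

# chkconfig: - 90 10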



Ok, thanks.  Changed to 10



Given that corosync's default is 20/80, shouldn't MCP be 21/79?

Regards
-steve



Re: [Pacemaker] MCP init script to 21/79?

2010-09-03 Thread Vladislav Bogdanov
03.09.2010 19:34, Steven Dake wrote:
>>> Nope, they are in a natural order for both start and stop sequences.
>>> So lower number means 'do start or stop earlier'.
>>>
>>> grep '# chkconfig' /etc/init.d/*
>>>
>>
>> Ok, thanks.  Changed to 10
>>
> 
> Given that corosync's default is 20/80, shouldn't MCP be 21/79?

I think that pcmk may require additional services to be started (I at
least see a reference to cooperation with cman for GFS as one of the pcmk
MCP scenarios in Andrew's wiki, but that scenario is still unclear to me),
so it is safer to have it start later; 90 is OK for me. That is also
what Vadim wrote about.

Best,
Vladislav



Re: [Pacemaker] MCP init script to 21/79?

2010-09-03 Thread Steven Dake

On 09/03/2010 09:56 AM, Vladislav Bogdanov wrote:

03.09.2010 19:34, Steven Dake wrote:

[...]
Given that corosync's default is 20/80, shouldn't MCP be 21/79?


I think that pcmk may require additional services to be started (I at
least see a reference to cooperation with cman for GFS as one of the pcmk
MCP scenarios in Andrew's wiki, but that scenario is still unclear to me),
so it is safer to have it start later; 90 is OK for me. That is also
what Vadim wrote about.

Best,
Vladislav


I was mistaken, not having read the current code.  Ignore the noise.

Regards
-steve



[Pacemaker] Couldn't find device [/dev/drbd/by-res/wwwdata]. Expected /dev/??? to exist

2010-09-03 Thread Alisson Landim

I was following the Clusters from Scratch guide and everything was fine
until I got here:

http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch08s05.html

The command:
grep ERROR: /var/log/messages | grep -v unpack_resources

Says:
Couldn't find device [/dev/drbd/by-res/wwwdata]. Expected /dev/??? to exist

crm_mon:
WebFS:0_start_0 (node=Master, call=53, rc=5, status=complete): not installed
WebFS:1_start_0 (node=HPHDX, call=46, rc=5, status=complete): not installed

I double-checked and verified that I installed everything the guide said to install:
yum install -y drbd-pacemaker
yum install -y gfs2-utils gfs-pcmk

Is there something I didn't install?
The device /dev/drbd/by-res/wwwdata does not exist; I think it should...

Any hint?


Re: [Pacemaker] Couldn't find device [/dev/drbd/by-res/wwwdata]. Expected /dev/??? to exist

2010-09-03 Thread Adam Gandelman
On 09/03/2010 01:05 PM, Alisson Landim wrote:
> I was following the Clusters from Scratch guide and everything was fine
> until I got here:
>
> http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch08s05.html
>
> The command:
> grep ERROR: /var/log/messages | grep -v unpack_resources
>
> Says:
> Couldn't find device [/dev/drbd/by-res/wwwdata]. Expected /dev/??? to exist

Don't know your DRBD version, but make sure you've got the corresponding
udev stuff installed. If not, replace references to /dev/drbd/by-res/*
with their actual device nodes (/dev/drbd0, for example).
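
(A quick way to check - a sketch only, assuming a typical DRBD 8.3 package
layout for the rules path:)

# is the DRBD udev rules file installed?
ls /etc/udev/rules.d/ | grep -i drbd
# and does udev create the by-res symlinks for configured resources?
ls -l /dev/drbd/by-res/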



-- 
: Adam Gandelman
: LINBIT | Your Way to High Availability
: Telephone: 503-573-1262 ext. 203
: Sales: 1-877-4-LINBIT / 1-877-454-6248
:
: 7959 SW Cirrus Dr.
: Beaverton, OR 97008
:
: http://www.linbit.com 



Re: [Pacemaker] Couldn't find device [/dev/drbd/by-res/wwwdata]. Expected /dev/??? to exist

2010-09-03 Thread Alisson Landim

Adam Gandelman wrote: "Don't know your DRBD version,"

yum install drbd-pacemaker returns:
Package drbd-pacemaker-8.3.7-2.fc13.x86_64 already installed and latest version
also drbdadm says:
Version: 8.3.7 (api:88)

Adam Gandelman wrote: "but make sure you've got the corresponding udev
stuff installed"

How do I check that?

Adam Gandelman wrote: "If not, replace references to /dev/drbd/by-res/*
with their actual device nodes (/dev/drbd0, for example)"

OK, I replaced it with drbd1, which is my device, and got in /var/log/messages:

crmd: [28175]: ERROR: process_lrm_event: Op dlm:1_start_0 (call=10): Cancelled
crmd: [28175]: ERROR: do_lrm_invoke: Not creating resource for a delete event: (null)
Filesystem[31537]: ERROR: Couldn't mount filesystem /dev/drbd1 on /var/www/html
crmd: [30593]: ERROR: process_lrm_event: LRM operation WebFS:1_stop_0 (24) Timed Out (timeout=10ms)

And my system broadcast a kernel bug message:

Package: kernel
Latest Crash: Fri 03 Sep 2010 07:44:56 PM
Command: not_applicable
Reason: kernel BUG at fs/dlm/lock.c:242!
Comment: None
Bug Reports:

kernel: invalid opcode:  [#1] SMP
kernel: last sysfs file: /sys/kernel/dlm/web/event_done
kernel: Stack:
kernel: Call Trace:
kernel: Code: 8b e0 00 00 00 8b 73 3c 44 8b 83 d0 00 00 00 48 c7 c7 38 c7 18 a0 31 c0 e8 85 f7 2c e1 48 c7 c7 6d c7 18 a0 31 c0 e8 77 f7 2c e1 <0f> 0b eb fe 59 0f 95 c0 0f b6 c0 5b c9 c3 55 48 89 e5 53 48 83



