Re: [Pacemaker] [PATCH 0/2] rsyslog/logrotate configuration snippets

2012-01-15 Thread Andrew Beekhof
On Thu, Jan 12, 2012 at 11:01 PM, Florian Haas  wrote:
> On Thu, Jan 5, 2012 at 10:15 PM, Florian Haas  wrote:
>> Florian Haas (2):
>>      extra: add rsyslog configuration snippet
>>      extra: add logrotate configuration snippet
>>
>>  configure.ac                      |    4 +++
>>  extra/Makefile.am                 |    2 +-
>>  extra/logrotate/Makefile.am       |    5 
>>  extra/logrotate/pacemaker.conf.in |    7 ++
>>  extra/rsyslog/Makefile.am         |    5 
>>  extra/rsyslog/pacemaker.conf.in   |   39 
>> +
>>  6 files changed, 61 insertions(+), 1 deletions(-)
>>  create mode 100644 extra/logrotate/Makefile.am
>>  create mode 100644 extra/logrotate/pacemaker.conf.in
>>  create mode 100644 extra/rsyslog/Makefile.am
>>  create mode 100644 extra/rsyslog/pacemaker.conf.in
>
> Any takers on these?

Sorry, I was off working on the new fencing logic and then corosync
2.0 support (when cman and all the plugins, including ours, go away).

So a couple of comments...

I fully agree that the state of our logging needs work and I can
understand people wanting to keep the vast majority of our logs out of
syslog.
I'm less thrilled about one-file-per-subsystem, the cluster will often
do a lot within a single second and splitting everything up really
hurts the ability to correlate messages.
I'd also suggest that /some/ information not coming directly from the
RAs is still appropriate for syslog (such as "I'm going to move A from
B to C" or "I'm about to turn of node D"), so the nuclear option isn't
really thrilling me.

In addition to the above distractions, I've been coming up to speed on
libqb's logging which is opening up a lot of new doors and should
hopefully help solve the underlying log issues.
For starters it lets syslog/stderr/logfile all log at different levels
of verbosity (and formats), it also supports blackboxes of which a
dump can be triggered in response to an error condition or manually by
the admin.
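
As a rough illustration, here is a minimal, untested sketch against libqb's
qb_log API as I remember it from qb_log.h (the program name and file paths
below are just placeholders, not what Pacemaker will actually use; link
with -lqb):

#include <syslog.h>
#include <qb/qbdefs.h>
#include <qb/qblog.h>

int
main(void)
{
    int32_t logfile;

    /* syslog target: NOTICE and above only */
    qb_log_init("pacemakerd", LOG_DAEMON, LOG_NOTICE);
    qb_log_filter_ctl(QB_LOG_SYSLOG, QB_LOG_FILTER_ADD,
                      QB_LOG_FILTER_FILE, "*", LOG_NOTICE);

    /* a separate logfile (placeholder path) that also gets debug output,
     * with its own format */
    logfile = qb_log_file_open("/var/log/pacemaker.log");
    qb_log_filter_ctl(logfile, QB_LOG_FILTER_ADD,
                      QB_LOG_FILTER_FILE, "*", LOG_DEBUG);
    qb_log_format_set(logfile, "%t [%p] %b");
    qb_log_ctl(logfile, QB_LOG_CONF_ENABLED, QB_TRUE);

    /* an in-memory blackbox that can be dumped on demand */
    qb_log_ctl(QB_LOG_BLACKBOX, QB_LOG_CONF_SIZE, 1024 * 1024);
    qb_log_ctl(QB_LOG_BLACKBOX, QB_LOG_CONF_ENABLED, QB_TRUE);

    qb_log(LOG_NOTICE, "reaches syslog and the logfile");
    qb_log(LOG_DEBUG, "only reaches the logfile and the blackbox");

    /* what an error- or admin-triggered dump boils down to */
    qb_log_blackbox_write_to_file("/tmp/pacemaker-blackbox.dump");

    qb_log_fini();
    return 0;
}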

The plan is something along the lines of: syslog gets NOTICE and
above, anything else (depending on debug level and trace options) goes
to /var/log/(cluster/?)pacemaker or whatever was configured in
corosync.
However, before I can enact that there will need to be an audit of the
messages currently going to INFO (674 entries) and NOTICE(160 entries)
with some getting bumped up, others down (possibly even to debug).
I'd certainly be interested in feedback as to which logs should and
should not make it.

If you want to get analytical about it, there is also an awk script
that I use when looking at what we log.
I'd be interested in some numbers from the field.

-- Andrew


#!/bin/bash
#
# Summarise cluster log traffic: count lines per daemon (syslog field 5)
# and per severity (field 7), then print totals and percentages.
# Reads the log on stdin.

awk 'BEGIN{
    keys[0] = "openais"
    keys[1] = "heartbeat:"
    keys[2] = "ccm:"
    keys[3] = "lrmd:"
    keys[4] = "crmd:"
    keys[5] = "pengine:"
    keys[6] = "cib:"
    keys[7] = "CTS:"
    keys[8] = "stonithd:"

    # padding used to line up the output columns
    format[0] = ""
    format[1] = ""
    format[2] = "\t"
    format[3] = "\t"
    format[4] = "\t"
    format[5] = ""
    format[6] = "\t"
    format[7] = "\t"
    format[8] = ""

    l_format[0] = "\t"
    l_format[1] = "\t"
    l_format[2] = "\t"
    l_format[3] = ""
    l_format[4] = "\t"
    l_format[5] = "\t"

    level[0] = "CRIT:"
    level[1] = "ERROR:"
    level[2] = "WARN:"
    level[3] = "notice:"
    level[4] = "info:"
    level[5] = "debug:"

    max = 9;
    l_max = 6;

    for (i = 0; i < max; i++) {
        values[i] = 0;
    }
    for (i = 0; i < l_max; i++) {
        l_values[i] = 0;
    }
}
{
    # match the line against the known daemons, then against the known levels
    for (i = 0; i < max; i++) {
        if (NF < 5) {
            break
        }
        if ($5 == keys[i]) {
            values[i]++
            for (j = 0; j < l_max; j++) {
                if ($7 == level[j]) {
                    l_values[j]++
                    break
                }
            }
            break
        }
    }
}
END{
    total = 0
    for (i = 0; i < max; i++) {
        total = values[i] + total
    }

    if (total == 0) {
        print "no cluster log lines found"
        exit
    }

    print "total line number is " total

    print "progs", "\t", "\t", "#of lines", "\t" "percentage"
    for (i = 0; i < max; i++) {
        print keys[i], format[i], "\t", values[i], "\t\t" values[i]/total*100 "%"
    }

    print "\nLog levels:"
    for (i = 0; i < l_max; i++) {
        print level[i], l_format[i], "\t", l_values[i], "\t\t" l_values[i]/total*100 "%"
    }
}'

Re: [Pacemaker] [PATCH 0/2] rsyslog/logrotate configuration snippets

2012-01-15 Thread Florian Haas
On Sun, Jan 15, 2012 at 9:27 PM, Andrew Beekhof  wrote:
> On Thu, Jan 12, 2012 at 11:01 PM, Florian Haas  wrote:
>> On Thu, Jan 5, 2012 at 10:15 PM, Florian Haas  wrote:
>>> Florian Haas (2):
>>>      extra: add rsyslog configuration snippet
>>>      extra: add logrotate configuration snippet
>>>
>>>  configure.ac                      |    4 +++
>>>  extra/Makefile.am                 |    2 +-
>>>  extra/logrotate/Makefile.am       |    5 
>>>  extra/logrotate/pacemaker.conf.in |    7 ++
>>>  extra/rsyslog/Makefile.am         |    5 
>>>  extra/rsyslog/pacemaker.conf.in   |   39 
>>> +
>>>  6 files changed, 61 insertions(+), 1 deletions(-)
>>>  create mode 100644 extra/logrotate/Makefile.am
>>>  create mode 100644 extra/logrotate/pacemaker.conf.in
>>>  create mode 100644 extra/rsyslog/Makefile.am
>>>  create mode 100644 extra/rsyslog/pacemaker.conf.in
>>
>> Any takers on these?
>
> Sorry, I was off working on the new fencing logic and then corosync
> 2.0 support (when cman and all the plugins, including ours, go away).
>
> So a couple of comments...
>
> I fully agree that the state of our logging needs work and I can
> understand people wanting to keep the vast majority of our logs out of
> syslog.
> I'm less thrilled about one-file-per-subsystem, the cluster will often
> do a lot within a single second and splitting everything up really
> hurts the ability to correlate messages.
> I'd also suggest that /some/ information not coming directly from the
> RAs is still appropriate for syslog (such as "I'm going to move A from
> B to C" or "I'm about to turn of node D"), so the nuclear option isn't
> really thrilling me.

So everything that is logged by the RAs with ocf_log, as I wrote in
the original post, _is_ still going to whatever the default syslog
destination may be. The rsyslog config doesn't change that at all.
(Stuff that the RAs simply barf out to stdout/err would go to the lrmd
log.) I maintain that this is the stuff that is also most useful to
people. And with just that information in the syslog, you usually get
a pretty clear idea of what the heck the cluster is doing on a node,
and in what order, in about 20 lines of logs close together -- rather
than intermingled with potentially hundreds of lines of other
cluster-related log output.

And disabling the "nuclear option" is a simple means of adding a "#"
before "& ~" in the config file. You can ship it that way by default
if you think that's more appropriate. That way, people would get the
split-out logs _plus_ everything in one file, which IMHO is sometimes
very useful for pengine or lrmd troubleshooting/debugging. I,
personally, just don't want Pacemaker to flood my /var/log/messages,
so I'd definitely leave the "& ~" in there, but that may be personal
preference. I wonder what others think.
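
For anyone reading along without the patch handy, the mechanism looks
roughly like this (an illustrative sketch only, not the literal contents of
extra/rsyslog/pacemaker.conf.in; the daemon names and file paths are
placeholders):

# Split each Pacemaker daemon's messages into its own file...
:programname, isequal, "crmd"           /var/log/pacemaker/crmd.log
& ~
:programname, isequal, "lrmd"           /var/log/pacemaker/lrmd.log
& ~
:programname, isequal, "pengine"        /var/log/pacemaker/pengine.log
& ~

# ...where each "& ~" discards the message just matched so it never also
# lands in /var/log/messages.  Commenting out the "& ~" lines ("# & ~")
# keeps the split-out files *and* the combined syslog stream.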

> In addition to the above distractions, I've been coming up to speed on
> libqb's logging which is opening up a lot of new doors and should
> hopefully help solve the underlying log issues.
> For starters it lets syslog/stderr/logfile all log at different levels
> of verbosity (and formats), it also supports blackboxes of which a
> dump can be triggered in response to an error condition or manually by
> the admin.
>
> The plan is something along the lines of: syslog gets NOTICE and
> above, anything else (depending on debug level and trace options) goes
> to /var/log/(cluster/?)pacemaker or whatever was configured in
> corosync.
> However, before I can enact that there will need to be an audit of the
> messages currently going to INFO (674 entries) and NOTICE(160 entries)
> with some getting bumped up, others down (possibly even to debug).
> I'd certainly be interested in feedback as to which logs should and
> should not make it.

Yes, even so, I (again, this is personal preference) would definitely
not want pengine logging (which even if half its INFO messages get
demoted to DEBUG, would still be pretty verbose) in my default
messages file.

> If you want to get analytical about it, there is also an awk script
> that I use when looking at what we log.
> I'd be interested in some numbers from the field.

Thanks; I can look at that after LCA.

Cheers,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About the rotation of the pe-file.

2012-01-15 Thread renayama19661014
Hi Lars,
Hi Andrew,

> If you want it to be between [0, max-1],
> obviously that should be
> while(max > 0 && sequence >= max) {
> sequence -= max;
> }

Thanks!! I will try it.

> Though I wonder why not simply:
> if (max == 0)
> return;
> if (sequence > max)
> sequence = 0;

I wondered, too.
However, I thought that Mr. Andrew's code might have been written that way 
intentionally.


Best Regards,
Hideo Yamauchi.

--- On Sat, 2012/1/14, Lars Ellenberg  wrote:

> On Fri, Jan 06, 2012 at 10:12:06AM +0900, renayama19661...@ybb.ne.jp wrote:
> > Hi Andrew,
> > 
> > Thank you for comments.
> > 
> > > Could you try with:
> > > 
> > >         while(max >= 0 && sequence > max) {
> > > 
> > 
> > The problem is not settled by this correction.
> > The rotation is carried out with a value other than 0.
> 
> If you want it to be between [0, max-1],
> obviously that should be
>         while(max > 0 && sequence >= max) {
>                 sequence -= max;
>         }
> 
> Though I wonder why not simply:
>     if (max == 0)
>         return;
>     if (sequence > max)
>         sequence = 0;
> 
> 
> -- 
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [PATCH 0/2] rsyslog/logrotate configuration snippets

2012-01-15 Thread Andrew Beekhof
On Mon, Jan 16, 2012 at 6:38 AM, Florian Haas  wrote:
> On Sun, Jan 15, 2012 at 9:27 PM, Andrew Beekhof  wrote:
>> On Thu, Jan 12, 2012 at 11:01 PM, Florian Haas  wrote:
>>> On Thu, Jan 5, 2012 at 10:15 PM, Florian Haas  wrote:
 Florian Haas (2):
      extra: add rsyslog configuration snippet
      extra: add logrotate configuration snippet

  configure.ac                      |    4 +++
  extra/Makefile.am                 |    2 +-
  extra/logrotate/Makefile.am       |    5 
  extra/logrotate/pacemaker.conf.in |    7 ++
  extra/rsyslog/Makefile.am         |    5 
  extra/rsyslog/pacemaker.conf.in   |   39 
 +
  6 files changed, 61 insertions(+), 1 deletions(-)
  create mode 100644 extra/logrotate/Makefile.am
  create mode 100644 extra/logrotate/pacemaker.conf.in
  create mode 100644 extra/rsyslog/Makefile.am
  create mode 100644 extra/rsyslog/pacemaker.conf.in
>>>
>>> Any takers on these?
>>
>> Sorry, I was off working on the new fencing logic and then corosync
>> 2.0 support (when cman and all the plugins, including ours, go away).
>>
>> So a couple of comments...
>>
>> I fully agree that the state of our logging needs work and I can
>> understand people wanting to keep the vast majority of our logs out of
>> syslog.
>> I'm less thrilled about one-file-per-subsystem, the cluster will often
>> do a lot within a single second and splitting everything up really
>> hurts the ability to correlate messages.
>> I'd also suggest that /some/ information not coming directly from the
>> RAs is still appropriate for syslog (such as "I'm going to move A from
>> B to C" or "I'm about to turn of node D"), so the nuclear option isn't
>> really thrilling me.
>
> So everything that is logged by the RAs with ocf_log, as I wrote in
> the original post, _is_ still going to whatever the default syslog
> destination may be. The rsyslog config doesn't change that at all.

By "Nuclear", I meant nothing at all from Pacemaker.
If that's what you want, there's a far easier way to achieve this and
keep usable logs around for debugging: set the facility to none and add a
logfile.

> (Stuff that the RAs simply barf out to stdout/err would go to the lrmd
> log.) I maintain that this is the stuff that is also most useful to
> people. And with just that information in the syslog, you usually get
> a pretty clear idea of what the heck the cluster is doing on a node,
> and in what order, in about 20 lines of logs close together -- rather
> than intermingled with potentially hundreds of lines of other
> cluster-related log output.

Did I not just finish agreeing that "hundreds of lines of other
cluster-related log[s]" was a problem?
I just don't think your knee-jerk "everything must go" approach is the answer.

>
> And disabling the "nuclear option" is a simple means of adding a "#"
> before "& ~" in the config file. You can ship it that way by default
> if you think that's more appropriate. That way, people would get the
> split-out logs _plus_ everything in one file, which IMHO is sometimes
> very useful for pengine or lrmd troubleshooting/debugging. I,
> personally, just don't want Pacemaker to flood my /var/log/messages,

Did you see me arguing against that?

> so I'd definitely leave the "& ~" in there, but that may be personal
> preference. I wonder what others think.
>
>> In addition to the above distractions, I've been coming up to speed on
>> libqb's logging which is opening up a lot of new doors and should
>> hopefully help solve the underlying log issues.
>> For starters it lets syslog/stderr/logfile all log at different levels
>> of verbosity (and formats), it also supports blackboxes of which a
>> dump can be triggered in response to an error condition or manually by
>> the admin.
>>
>> The plan is something along the lines of: syslog gets NOTICE and
>> above, anything else (depending on debug level and trace options) goes
>> to /var/log/(cluster/?)pacemaker or whatever was configured in
>> corosync.
>> However, before I can enact that there will need to be an audit of the
>> messages currently going to INFO (674 entries) and NOTICE(160 entries)
>> with some getting bumped up, others down (possibly even to debug).
>> I'd certainly be interested in feedback as to which logs should and
>> should not make it.
>
> Yes, even so, I (again, this is personal preference) would definitely
> not want pengine logging (which even if half its INFO messages get
> demoted to DEBUG, would still be pretty verbose) in my default
> messages file.

Sigh, please take time out from preaching to actually read the
replies.  You might learn something.
Your precious default messages file wouldn't be getting /any/ INFO
logs from pacemaker.

And I'm guessing your complaints are based on 1.0 logging too right?
Because for a long time now, the PE in 1.1 has only logged NOTICE and
above by default, which means about 1 additional line per resource +
error

Re: [Pacemaker] [Question] About the rotation of the pe-file.

2012-01-15 Thread Andrew Beekhof
On Mon, Jan 16, 2012 at 10:56 AM,   wrote:
> Hi Lars,
> Hi Andrew,
>
>> If you want it to be between [0, max-1],
>> obviously that should be
>>         while(max > 0 && sequence >= max) {
>>                 sequence -= max;
>>         }
>
> Thanks!! I will try it.
>
>> Though I wonder why not simply:
>>     if (max == 0)
>>         return;
>>     if (sequence > max)
>>         sequence = 0;
>
> I wondered, too.
> However, I thought that Mr. Andrew's code might have been written that way 
> intentionally.

I was probably trying to get too fancy when dealing with run-time
reduction of max.
Let's go with your way :-)

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] The attrd does not sometimes stop.

2012-01-15 Thread renayama19661014
Hi Lars,

Thank you for comments and suggestion.

> > poll([{fd=7, events=POLLIN|POLLPRI}, {fd=4, events=POLLIN|POLLPRI}, {fd=5, 
> > events=POLLIN|POLLPRI}], 3, -1
> 
> Note the -1 (infinity timeout!)
> 
> So even though the trigger was (presumably) set,
> and the ->prepare() should have returned true,
> the mainloop waits forever for "something" to happen on those file 
> descriptors.
> 
> 
> I suggest this:
> 
> crm_trigger_prepare should set *timeout = 0, if trigger is set.
> 
> Also think about this race: crm_trigger_prepare was already
> called, only then the signal came in...
> 
> diff --git a/lib/common/mainloop.c b/lib/common/mainloop.c
> index 2e8b1d0..fd17b87 100644
> --- a/lib/common/mainloop.c
> +++ b/lib/common/mainloop.c
> @@ -33,6 +33,13 @@ static gboolean
>  crm_trigger_prepare(GSource * source, gint * timeout)
>  {
>  crm_trigger_t *trig = (crm_trigger_t *) source;
> +/* Do not delay signal processing by the mainloop poll stage */
> +if (trig->trigger)
> +*timeout = 0;
> +/* To avoid races between signal delivery and the mainloop poll stage,
> + * make sure we always have a finite timeout. Unit: milliseconds. */
> +else
> +*timeout = 5000; /* arbitrary */
>  
>  return trig->trigger;
>  }
> 
> 
> This scenario does not let the blocked IPC off the hook, though.
> That is still possible, both for blocking send and blocking receive,
> so that should probably be fixed as well, somehow.
> I'm not sure how likely this "stuck in blocking IPC" is, though.

I will keep investigating the problem with your suggested correction applied.

I will report back if I find more information.

Best Regards,
Hideo Yamauchi.

--- On Sat, 2012/1/14, Lars Ellenberg  wrote:

> On Tue, Jan 10, 2012 at 04:43:51PM +0900, renayama19661...@ybb.ne.jp wrote:
> > Hi Lars,
> > 
> > I attach strace file when a problem reappeared at the end of last year.
> > I used glue which applied your patch for confirmation.
> > 
> > It is the file which I picked with attrd by strace -p command right before 
> > I stop Heartbeat.
> > 
> > Finally SIGTERM caught it, but attrd did not stop.
> > The attrd stopped afterwards when I sent SIGKILL.
> 
> The strace reveals something interesting:
> 
> This poll looks like the mainloop poll,
> but some ->prepare() has modified the timeout to be 0,
> so we proceed directly to ->check() and then ->dispatch().
> 
> > poll([{fd=7, events=POLLIN|POLLPRI}, {fd=4, events=POLLIN|POLLPRI}, {fd=8, 
> > events=POLLIN|POLLPRI}], 3, 0) = 1 ([{fd=8, revents=POLLIN|POLLHUP}])
> 
> > times({tms_utime=2, tms_stime=3, tms_cutime=0, tms_cstime=0}) = 433738632
> > recv(4, 0x95af308, 576, MSG_DONTWAIT)   = -1 EAGAIN (Resource temporarily 
> > unavailable)
> ...
> > recv(7, 0x95b1657, 3513, MSG_DONTWAIT)  = -1 EAGAIN (Resource temporarily 
> > unavailable)
> > poll([{fd=7, events=0}], 1, 0)          = ? ERESTART_RESTARTBLOCK (To be 
> > restarted)
> > --- SIGTERM (Terminated) @ 0 (0) ---
> > sigreturn()                             = ? (mask now [])
> 
> Ok. signal received, trigger set.
> Still finishing this mainloop iteration, though.
> 
> These recv(),poll() look like invocations of G_CH_prepare_int().
> Does not matter much, though.
> 
> > recv(7, 0x95b1657, 3513, MSG_DONTWAIT)  = -1 EAGAIN (Resource temporarily 
> > unavailable)
> > poll([{fd=7, events=0}], 1, 0)          = 0 (Timeout)
> > recv(7, 0x95b1657, 3513, MSG_DONTWAIT)  = -1 EAGAIN (Resource temporarily 
> > unavailable)
> > poll([{fd=7, events=0}], 1, 0)          = 0 (Timeout)
> 
> > times({tms_utime=2, tms_stime=3, tms_cutime=0, tms_cstime=0}) = 433738634
> 
> Now we proceed to the next mainloop poll:
> 
> > poll([{fd=7, events=POLLIN|POLLPRI}, {fd=4, events=POLLIN|POLLPRI}, {fd=5, 
> > events=POLLIN|POLLPRI}], 3, -1
> 
> Note the -1 (infinity timeout!)
> 
> So even though the trigger was (presumably) set,
> and the ->prepare() should have returned true,
> the mainloop waits forever for "something" to happen on those file 
> descriptors.
> 
> 
> I suggest this:
> 
> crm_trigger_prepare should set *timeout = 0, if trigger is set.
> 
> Also think about this race: crm_trigger_prepare was already
> called, only then the signal came in...
> 
> diff --git a/lib/common/mainloop.c b/lib/common/mainloop.c
> index 2e8b1d0..fd17b87 100644
> --- a/lib/common/mainloop.c
> +++ b/lib/common/mainloop.c
> @@ -33,6 +33,13 @@ static gboolean
>  crm_trigger_prepare(GSource * source, gint * timeout)
>  {
>      crm_trigger_t *trig = (crm_trigger_t *) source;
> +    /* Do not delay signal processing by the mainloop poll stage */
> +    if (trig->trigger)
> +        *timeout = 0;
> +    /* To avoid races between signal delivery and the mainloop poll stage,
> +     * make sure we always have a finite timeout. Unit: milliseconds. */
> +    else
> +        *timeout = 5000; /* arbitrary */
>  
>      return trig->trigger;
>  }
> 
> 
> This scenario does not let the blocked IPC off the hook, though.
> That is still possible, both for b

Re: [Pacemaker] [Question] About the rotation of the pe-file.

2012-01-15 Thread renayama19661014
Hi Andrew,
Hi Lars,

> >> If you want it to be between [0, max-1],
> >> obviously that should be
> >> while(max > 0 && sequence >= max) {
> >> sequence -= max;
> >> }

The rotation was indeed carried out from 0 to max-1.


> >> Though I wonder why not simply:
> >> if (max == 0)
> >> return;
> >> if (sequence > max)
> >> sequence = 0;

The rotation was indeed carried out from 0 to max.
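
To make the difference concrete, here is a small standalone comparison of
the two snippets (just a demo program, not the Pacemaker code itself):

#include <stdio.h>

/* First suggestion: wrap the sequence back into [0, max-1] */
static int wrap_modulo(int sequence, int max)
{
    while (max > 0 && sequence >= max) {
        sequence -= max;
    }
    return sequence;
}

/* Simpler suggestion: reset to 0 once the sequence exceeds max,
 * so values stay in [0, max] */
static int wrap_reset(int sequence, int max)
{
    if (max == 0) {
        return sequence;
    }
    if (sequence > max) {
        sequence = 0;
    }
    return sequence;
}

int main(void)
{
    int max = 3;
    int seq;

    for (seq = 0; seq <= 5; seq++) {
        printf("seq=%d  first->%d  second->%d\n",
               seq, wrap_modulo(seq, max), wrap_reset(seq, max));
    }
    /* prints:
     *   seq=0  first->0  second->0
     *   seq=1  first->1  second->1
     *   seq=2  first->2  second->2
     *   seq=3  first->0  second->3
     *   seq=4  first->1  second->0
     *   seq=5  first->2  second->0
     */
    return 0;
}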

> I was probably trying to get too fancy when dealing with run-time
> reduction of max.
> Lets go with your way :-)

I think applying the second, simpler correction is the right way to go.

Best Regards,
Hideo Yamauchi.



--- On Mon, 2012/1/16, Andrew Beekhof  wrote:

> On Mon, Jan 16, 2012 at 10:56 AM,   wrote:
> > Hi Lars,
> > Hi Andrew,
> >
> >> If you want it to be between [0, max-1],
> >> obviously that should be
> >>         while(max > 0 && sequence >= max) {
> >>                 sequence -= max;
> >>         }
> >
> > Thanks!! I will try it.
> >
> >> Though I wonder why not simply:
> >>     if (max == 0)
> >>         return;
> >>     if (sequence > max)
> >>         sequence = 0;
> >
> > I wondered, too.
> > However, I thought that Mr. Andrew's code might have been written that way 
> > intentionally.
> 
> I was probably trying to get too fancy when dealing with run-time
> reduction of max.
> Lets go with your way :-)
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [PATCH 0/2] rsyslog/logrotate configuration snippets

2012-01-15 Thread Florian Haas
On Mon, Jan 16, 2012 at 10:59 AM, Andrew Beekhof  wrote:

> By "Nuclear", I meant nothing at all from Pacemaker.

Which is not what it does.

> If thats what you want, there's a far easier way to achieve this and
> keep usable logs around for debugging, set facility to none and add a
> logfile.

No, I don't want that.

>> (Stuff that the RAs simply barf out to stdout/err would go to the lrmd
>> log.) I maintain that this is the stuff that is also most useful to
>> people. And with just that information in the syslog, you usually get
>> a pretty clear idea of what the heck the cluster is doing on a node,
>> and in what order, in about 20 lines of logs close together -- rather
>> than intermingled with potentially hundreds of lines of other
>> cluster-related log output.
>
> Did I not just finish agreeing that "hundreds of lines of other
> cluster-related log[s]" was a problem?

What in my statement above indicates that I assumed otherwise?

> I just don't think your knee-jerk "everything must go" approach is the answer.

That is not my approach.

>> And disabling the "nuclear option" is a simple means of adding a "#"
>> before "& ~" in the config file. You can ship it that way by default
>> if you think that's more appropriate. That way, people would get the
>> split-out logs _plus_ everything in one file, which IMHO is sometimes
>> very useful for pengine or lrmd troubleshooting/debugging. I,
>> personally, just don't want Pacemaker to flood my /var/log/messages,
>
> Did you see me arguing against that?

No. What makes you think I did?

>> so I'd definitely leave the "& ~" in there, but that may be personal
>> preference. I wonder what others think.
>>
>>> In addition to the above distractions, I've been coming up to speed on
>>> libqb's logging which is opening up a lot of new doors and should
>>> hopefully help solve the underlying log issues.
>>> For starters it lets syslog/stderr/logfile all log at different levels
>>> of verbosity (and formats), it also supports blackboxes of which a
>>> dump can be triggered in response to an error condition or manually by
>>> the admin.
>>>
>>> The plan is something along the lines of: syslog gets NOTICE and
>>> above, anything else (depending on debug level and trace options) goes
>>> to /var/log/(cluster/?)pacemaker or whatever was configured in
>>> corosync.
>>> However, before I can enact that there will need to be an audit of the
>>> messages currently going to INFO (674 entries) and NOTICE(160 entries)
>>> with some getting bumped up, others down (possibly even to debug).
>>> I'd certainly be interested in feedback as to which logs should and
>>> should not make it.
>>
>> Yes, even so, I (again, this is personal preference) would definitely
>> not want pengine logging (which even if half its INFO messages get
>> demoted to DEBUG, would still be pretty verbose) in my default
>> messages file.
>
> Sigh, please take time out from preaching to actually read the
> replies.  You might learn something.

This is getting frustrating. Not this logging discussion, but pretty
much any discussion the two of us have been having lately. (And no,
this is not an assignment of guilt or responsibility -- it takes two
to tango.) Let's try and sort this out in person on Thursday.

Florian

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [PATCH 0/2] rsyslog/logrotate configuration snippets

2012-01-15 Thread Andrew Beekhof
On Mon, Jan 16, 2012 at 2:42 PM, Florian Haas  wrote:
> On Mon, Jan 16, 2012 at 10:59 AM, Andrew Beekhof  wrote:
>
>> By "Nuclear", I meant nothing at all from Pacemaker.
>
> Which is not what it does.

The daemons.  The RAs are not "from Pacemaker".
This is why I wrote in my first reply:

"/some/ information not coming directly from the
RAs is still appropriate for syslog (such as "I'm going to move A from
B to C" or "I'm about to turn of node D")"

Where in 
https://github.com/fghaas/pacemaker/blob/9e9bafd44971a8f4c3cd1de62fb2278fab28489e/extra/rsyslog/pacemaker.conf.in
does it allow any log from any daemon through?

>
>> If thats what you want, there's a far easier way to achieve this and
>> keep usable logs around for debugging, set facility to none and add a
>> logfile.
>
> No, I don't want that.

So one file per pacemaker daemon is central to your proposal?

>>> (Stuff that the RAs simply barf out to stdout/err would go to the lrmd
>>> log.) I maintain that this is the stuff that is also most useful to
>>> people. And with just that information in the syslog, you usually get
>>> a pretty clear idea of what the heck the cluster is doing on a node,
>>> and in what order, in about 20 lines of logs close together -- rather
>>> than intermingled with potentially hundreds of lines of other
>>> cluster-related log output.
>>
>> Did I not just finish agreeing that "hundreds of lines of other
>> cluster-related log[s]" was a problem?
>
> What in my statement above indicates that I assumed otherwise?

The part just above my reply where you're implying that the only
alternative to your idea of removing all daemon logging from syslog
was: resource logs "intermingled with hundreds of lines of other
cluster-related log output".

>
>> I just don't think your knee-jerk "everything must go" approach is the 
>> answer.
>
> That is not my approach.
>
>>> And disabling the "nuclear option" is a simple means of adding a "#"
>>> before "& ~" in the config file. You can ship it that way by default
>>> if you think that's more appropriate. That way, people would get the
>>> split-out logs _plus_ everything in one file, which IMHO is sometimes
>>> very useful for pengine or lrmd troubleshooting/debugging. I,
>>> personally, just don't want Pacemaker to flood my /var/log/messages,
>>
>> Did you see me arguing against that?
>
> No. What makes you think I did?

Because you keep trying to tell me the current approach is wrong.
I know that, I said that, I just don't happen to think your idea is
the solution.

>
>>> so I'd definitely leave the "& ~" in there, but that may be personal
>>> preference. I wonder what others think.
>>>
 In addition to the above distractions, I've been coming up to speed on
 libqb's logging which is opening up a lot of new doors and should
 hopefully help solve the underlying log issues.
 For starters it lets syslog/stderr/logfile all log at different levels
 of verbosity (and formats), it also supports blackboxes of which a
 dump can be triggered in response to an error condition or manually by
 the admin.

 The plan is something along the lines of: syslog gets NOTICE and
 above, anything else (depending on debug level and trace options) goes
 to /var/log/(cluster/?)pacemaker or whatever was configured in
 corosync.
 However, before I can enact that there will need to be an audit of the
 messages currently going to INFO (674 entries) and NOTICE(160 entries)
 with some getting bumped up, others down (possibly even to debug).
 I'd certainly be interested in feedback as to which logs should and
 should not make it.
>>>
>>> Yes, even so, I (again, this is personal preference) would definitely
>>> not want pengine logging (which even if half its INFO messages get
>>> demoted to DEBUG, would still be pretty verbose) in my default
>>> messages file.
>>
>> Sigh, please take time out from preaching to actually read the
>> replies.  You might learn something.
>
> This is getting frustrating. Not this logging discussion, but pretty
> much any discussion the two of us have been having lately. (And no,
> this is not an assignment of guilt or responsibility -- it takes two
> to tango.) Let's try and sort this out in person on Thursday.
>
> Florian
>

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About the rotation of the pe-file.

2012-01-15 Thread Andrew Beekhof
On Mon, Jan 16, 2012 at 11:48 AM,   wrote:
> Hi Andrew,
> Hi Lars,
>
>> >> If you want it to be between [0, max-1],
>> >> obviously that should be
>> >>         while(max > 0 && sequence >= max) {
>> >>                 sequence -= max;
>> >>         }
>
> The rotation was carried out definitely from 0 to max-1.
>
>
>> >> Though I wonder why not simply:
>> >>     if (max == 0)
>> >>         return;
>> >>     if (sequence > max)
>> >>         sequence = 0;
>
> The rotation was carried out definitely from 0 to max.
>
>> I was probably trying to get too fancy when dealing with run-time
>> reduction of max.
>> Lets go with your way :-)
>
> I think that the application of the patch of the second simple correction is 
> good.

It's in my private tree so far:
  https://github.com/beekhof/pacemaker/commit/bfbb73c

It will make its way to clusterlabs when I merge next.

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Problem] The attrd does not sometimes stop.

2012-01-15 Thread Andrew Beekhof
On Sun, Jan 15, 2012 at 1:57 AM, Lars Ellenberg
 wrote:
> On Tue, Jan 10, 2012 at 04:43:51PM +0900, renayama19661...@ybb.ne.jp wrote:
>> Hi Lars,
>>
>> I attach strace file when a problem reappeared at the end of last year.
>> I used glue which applied your patch for confirmation.
>>
>> It is the file which I picked with attrd by strace -p command right before I 
>> stop Heartbeat.
>>
>> Finally SIGTERM caught it, but attrd did not stop.
>> The attrd stopped afterwards when I sent SIGKILL.
>
> The strace reveals something interesting:
>
> This poll looks like the mainloop poll,
> but some ->prepare() has modified the timeout to be 0,
> so we proceed directly to ->check() and then ->dispatch().
>
>> poll([{fd=7, events=POLLIN|POLLPRI}, {fd=4, events=POLLIN|POLLPRI}, {fd=8, 
>> events=POLLIN|POLLPRI}], 3, 0) = 1 ([{fd=8, revents=POLLIN|POLLHUP}])
>
>> times({tms_utime=2, tms_stime=3, tms_cutime=0, tms_cstime=0}) = 433738632
>> recv(4, 0x95af308, 576, MSG_DONTWAIT)   = -1 EAGAIN (Resource temporarily 
>> unavailable)
> ...
>> recv(7, 0x95b1657, 3513, MSG_DONTWAIT)  = -1 EAGAIN (Resource temporarily 
>> unavailable)
>> poll([{fd=7, events=0}], 1, 0)          = ? ERESTART_RESTARTBLOCK (To be 
>> restarted)
>> --- SIGTERM (Terminated) @ 0 (0) ---
>> sigreturn()                             = ? (mask now [])
>
> Ok. signal received, trigger set.
> Still finishing this mainloop iteration, though.
>
> These recv(),poll() look like invocations of G_CH_prepare_int().
> Does not matter much, though.
>
>> recv(7, 0x95b1657, 3513, MSG_DONTWAIT)  = -1 EAGAIN (Resource temporarily 
>> unavailable)
>> poll([{fd=7, events=0}], 1, 0)          = 0 (Timeout)
>> recv(7, 0x95b1657, 3513, MSG_DONTWAIT)  = -1 EAGAIN (Resource temporarily 
>> unavailable)
>> poll([{fd=7, events=0}], 1, 0)          = 0 (Timeout)
>
>> times({tms_utime=2, tms_stime=3, tms_cutime=0, tms_cstime=0}) = 433738634
>
> Now we proceed to the next mainloop poll:
>
>> poll([{fd=7, events=POLLIN|POLLPRI}, {fd=4, events=POLLIN|POLLPRI}, {fd=5, 
>> events=POLLIN|POLLPRI}], 3, -1
>
> Note the -1 (infinity timeout!)
>
> So even though the trigger was (presumably) set,
> and the ->prepare() should have returned true,
> the mainloop waits forever for "something" to happen on those file 
> descriptors.
>
>
> I suggest this:
>
> crm_trigger_prepare should set *timeout = 0, if trigger is set.
>
> Also think about this race: crm_trigger_prepare was already
> called, only then the signal came in...
>
> diff --git a/lib/common/mainloop.c b/lib/common/mainloop.c
> index 2e8b1d0..fd17b87 100644
> --- a/lib/common/mainloop.c
> +++ b/lib/common/mainloop.c
> @@ -33,6 +33,13 @@ static gboolean
>  crm_trigger_prepare(GSource * source, gint * timeout)
>  {
>     crm_trigger_t *trig = (crm_trigger_t *) source;
> +    /* Do not delay signal processing by the mainloop poll stage */
> +    if (trig->trigger)
> +           *timeout = 0;
> +    /* To avoid races between signal delivery and the mainloop poll stage,
> +     * make sure we always have a finite timeout. Unit: milliseconds. */
> +    else
> +           *timeout = 5000; /* arbitrary */
>
>     return trig->trigger;
>  }
>
>
> This scenario does not let the blocked IPC off the hook, though.
> That is still possible, both for blocking send and blocking receive,
> so that should probably be fixed as well, somehow.
> I'm not sure how likely this "stuck in blocking IPC" is, though.

Interesting, are you sure you're in the right function though?
trigger and signal events don't have a file descriptor... wouldn't
these polls be for the IPC related sources and wouldn't they be
setting their own timeout?
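
(For anyone following along, here is a minimal standalone GSource sketch --
plain GLib, not the actual mainloop.c -- showing how a source's prepare()
result and *timeout feed the poll stage.)

#include <glib.h>

typedef struct {
    GSource  source;
    gboolean trigger;   /* set asynchronously, e.g. by a signal handler */
} trigger_source_t;

static gboolean
trigger_prepare(GSource *source, gint *timeout)
{
    trigger_source_t *trig = (trigger_source_t *) source;

    /* Whatever we write to *timeout caps how long the main loop's poll()
     * may block.  If no source sets a finite timeout and none is ready,
     * the poll can run with -1 (block forever). */
    *timeout = trig->trigger ? 0 : 5000;
    return trig->trigger;
}

static gboolean
trigger_check(GSource *source)
{
    return ((trigger_source_t *) source)->trigger;
}

static gboolean
trigger_dispatch(GSource *source, GSourceFunc callback, gpointer userdata)
{
    ((trigger_source_t *) source)->trigger = FALSE;
    return callback ? callback(userdata) : TRUE;
}

static GSourceFuncs trigger_funcs = {
    trigger_prepare, trigger_check, trigger_dispatch, NULL
};

static gboolean
fired(gpointer data)
{
    g_print("trigger dispatched\n");
    return TRUE;
}

int
main(void)
{
    GSource *src = g_source_new(&trigger_funcs, sizeof(trigger_source_t));

    g_source_set_callback(src, fired, NULL, NULL);
    g_source_attach(src, NULL);

    ((trigger_source_t *) src)->trigger = TRUE;   /* simulate the signal */
    g_main_context_iteration(NULL, TRUE);         /* prepare/check/dispatch */

    g_source_unref(src);
    return 0;
}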

>
> --
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
>
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
>

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] need cluster-wide variables

2012-01-15 Thread Andrew Beekhof
On Wed, Jan 11, 2012 at 8:24 AM, Arnold Krille  wrote:
> On Tuesday 10 January 2012 20:08:50 Dejan Muhamedagic wrote:
>> On Thu, Jan 05, 2012 at 04:59:13AM +, shashi wrote:
>> > So we have two probable options:
>> > 1. An work around to achieve this tri-state efficiently without changing
>> > pacemaker internals.
>> > 2. Modify pacemaker to add this tri-state feature.
>> This sounds too complex to me. And it won't scale (if there are
>> too many Pre-Masters you may want to introduce Pre-Pre-Masters).
>> How about moving the logic to slaves, i.e. that they figure out
>> themselves which master to connect to? This is of course
>> oversimplification, but I'd rather try that way.
>
> Is it possible for slaves to modify their score for promotion? I think that
> would be an interesting feature.
>
> Probably something like that could already be achieved with dependency-rules
> and variables. But I think a function for resource agents to increase or
> decrease the score would be more clean.

This is what crm_master is designed for.

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] SBD stonith issues in RHEL cluster

2012-01-15 Thread Andrew Beekhof
On Mon, Jan 9, 2012 at 7:30 PM, Qiu Zhigang  wrote:
> Hi, All
>
>
>
> I want to use SBD device as a stonith device in RHCS, but how could I
> configure sbd resource agent?

I don't think you can; it doesn't ship as part of RHCS at this stage.
No idea if that will change.

You'd have to build and install it yourself.

>
>
>
> I use the following command,
>
>
>
> primitive sbd_fence stonith:external/sbd params
> sbd_device="/dev/disk/by-id/scsi-3300035230a3a"
>
>
>
> but a error occurred,
>
>
>
> ERROR: sbd_fence: parameter sbd_device does not exist
>
>
>
> I want to confirm whether I could use the sbd stonith device in RHCS , and
> how should I configure the resource and parameter corresponding?
>
>
>
>
>
> Best Regards,
>
> Qiu Zhigang
>
>

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Question] About the rotation of the pe-file.

2012-01-15 Thread renayama19661014
Hi Andrew,
Hi Lars,

> Its in my private tree so far:
>   https://github.com/beekhof/pacemaker/commit/bfbb73c
> 
> It will make its way to clusterlabs when I merge next.

All right!

Many Thanks!

Hideo Yamauchi.

--- On Mon, 2012/1/16, Andrew Beekhof  wrote:

> On Mon, Jan 16, 2012 at 11:48 AM,   wrote:
> > Hi Andrew,
> > Hi Lars,
> >
> >> >> If you want it to be between [0, max-1],
> >> >> obviously that should be
> >> >>         while(max > 0 && sequence >= max) {
> >> >>                 sequence -= max;
> >> >>         }
> >
> > The rotation was carried out definitely from 0 to max-1.
> >
> >
> >> >> Though I wonder why not simply:
> >> >>     if (max == 0)
> >> >>         return;
> >> >>     if (sequence > max)
> >> >>         sequence = 0;
> >
> > The rotation was carried out definitely from 0 to max.
> >
> >> I was probably trying to get too fancy when dealing with run-time
> >> reduction of max.
> >> Lets go with your way :-)
> >
> > I think that the application of the patch of the second simple correction 
> > is good.
> 
> Its in my private tree so far:
>   https://github.com/beekhof/pacemaker/commit/bfbb73c
> 
> It will make its way to clusterlabs when I merge next.
> 

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2012-01-15 Thread Andrew Beekhof
Sorry for not getting to this earlier...

On Mon, Dec 19, 2011 at 10:39 PM, Vladislav Bogdanov
 wrote:
> 09.12.2011 08:44, Andrew Beekhof wrote:
>> On Fri, Dec 9, 2011 at 3:16 PM, Vladislav Bogdanov  
>> wrote:
>>> 09.12.2011 03:11, Andrew Beekhof wrote:
 On Fri, Dec 2, 2011 at 1:32 AM, Vladislav Bogdanov  
 wrote:
> Hi Andrew,
>
> I investigated on my test cluster what actually happens with dlm and
> fencing.
>
> I added more debug messages to dlm dump, and also did a re-kick of nodes
> after some time.
>
> Results are that stonith history actually doesn't contain any
> information until pacemaker decides to fence node itself.

 ...

> From my PoV that means that the call to
> crm_terminate_member_no_mainloop() does not actually schedule fencing
> operation.

 You're going to have to remind me... what does your copy of
 crm_terminate_member_no_mainloop() look like?
 This is with the non-cman editions of the controlds too right?
>>>
>>> Just latest github's version. You changed some dlm_controld.pcmk
>>> functionality, so it asks stonithd for fencing results instead of XML
>>> magic. But call to crm_terminate_member_no_mainloop() remains the same
>>> there. But yes, that version communicates stonithd directly too.
>>>
>>> SO, the problem here is just with crm_terminate_member_no_mainloop()
>>> which for some reason skips actual fencing request.
>>
>> There should be some logs, either indicating that it tried, or that it 
>> failed.
>
> Nothing about fencing.
> Only messages about history requests:
>
> stonith-ng: [1905]: info: stonith_command: Processed st_fence_history
> from cluster-dlm: rc=0

The logs would be from the dlm, since that's who's calling
crm_terminate_member_no_mainloop().

>
> I even moved all fencing code to dlm_controld to have better control on
> what does it do (and not to rebuild pacemaker to play with that code).
> dlm_tool dump prints the same line every second, stonith-ng prints
> history requests.
>
> A little bit odd, by I saw one time that fencing request from
> cluster-dlm succeeded, but only right after node was fenced by
> pacemaker. As a result, node was switched off instead of reboot.
>
> That raises one more question: is it correct to call st->cmds->fence()
> with third parameter set to "off"?
> I think that "reboot" is more consistent with the rest of fencing subsystem.

Either is legitimate.

>
> At the same time, stonith_admin -B succeeds.
> The main difference I see is st_opt_sync_call in a latter case.
> Will try to experiment with it.

/Shouldn't/ matter.

___
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2012-01-15 Thread Andrew Beekhof
On Mon, Dec 19, 2011 at 11:11 PM, Vladislav Bogdanov
 wrote:
> 19.12.2011 14:39, Vladislav Bogdanov wrote:
>> 09.12.2011 08:44, Andrew Beekhof wrote:
>>> On Fri, Dec 9, 2011 at 3:16 PM, Vladislav Bogdanov  
>>> wrote:
 09.12.2011 03:11, Andrew Beekhof wrote:
> On Fri, Dec 2, 2011 at 1:32 AM, Vladislav Bogdanov  
> wrote:
>> Hi Andrew,
>>
>> I investigated on my test cluster what actually happens with dlm and
>> fencing.
>>
>> I added more debug messages to dlm dump, and also did a re-kick of nodes
>> after some time.
>>
>> Results are that stonith history actually doesn't contain any
>> information until pacemaker decides to fence node itself.
>
> ...
>
>> From my PoV that means that the call to
>> crm_terminate_member_no_mainloop() does not actually schedule fencing
>> operation.
>
> You're going to have to remind me... what does your copy of
> crm_terminate_member_no_mainloop() look like?
> This is with the non-cman editions of the controlds too right?

 Just latest github's version. You changed some dlm_controld.pcmk
 functionality, so it asks stonithd for fencing results instead of XML
 magic. But call to crm_terminate_member_no_mainloop() remains the same
 there. But yes, that version communicates stonithd directly too.

 SO, the problem here is just with crm_terminate_member_no_mainloop()
 which for some reason skips actual fencing request.
>>>
>>> There should be some logs, either indicating that it tried, or that it 
>>> failed.
>>
>> Nothing about fencing.
>> Only messages about history requests:
>>
>> stonith-ng: [1905]: info: stonith_command: Processed st_fence_history
>> from cluster-dlm: rc=0
>>
>> I even moved all fencing code to dlm_controld to have better control on
>> what does it do (and not to rebuild pacemaker to play with that code).
>> dlm_tool dump prints the same line every second, stonith-ng prints
>> history requests.
>>
>> A little bit odd, by I saw one time that fencing request from
>> cluster-dlm succeeded, but only right after node was fenced by
>> pacemaker. As a result, node was switched off instead of reboot.
>>
>> That raises one more question: is it correct to call st->cmds->fence()
>> with third parameter set to "off"?
>> I think that "reboot" is more consistent with the rest of fencing subsystem.
>>
>> At the same time, stonith_admin -B succeeds.
>> The main difference I see is st_opt_sync_call in a latter case.
>> Will try to experiment with it.
>
> Yes!!!
>
> Now I see following:
> Dec 19 11:53:34 vd01-a cluster-dlm: [2474]: info:
> pacemaker_terminate_member: Requesting that node 1090782474/vd01-b be fenced

So the important question... what did you change?

> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info:
> initiate_remote_stonith_op: Initiating remote operation reboot for
> vd01-b: 21425fc0-4311-40fa-9647-525c3f258471
> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: crm_get_peer: Node
> vd01-c now has id: 1107559690
> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: stonith_command:
> Processed st_query from vd01-c: rc=0
> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: crm_get_peer: Node
> vd01-d now has id: 1124336906
> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: stonith_command:
> Processed st_query from vd01-d: rc=0
> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: stonith_command:
> Processed st_query from vd01-a: rc=0
> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: call_remote_stonith:
> Requesting that vd01-c perform op reboot vd01-b
> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: crm_get_peer: Node
> vd01-b now has id: 1090782474
> ...
> Dec 19 11:53:40 vd01-a stonith-ng: [1905]: info: stonith_command:
> Processed st_fence_history from cluster-dlm: rc=0
> Dec 19 11:53:40 vd01-a crmd: [1910]: info: tengine_stonith_notify: Peer
> vd01-b was terminated (reboot) by vd01-c for vd01-a
> (ref=21425fc0-4311-40fa-9647-525c3f258471): OK
>
> But, then I see minor issue that node is marked to be fenced again:
> Dec 19 11:53:40 vd01-a pengine: [1909]: WARN: pe_fence_node: Node vd01-b
> will be fenced because it is un-expectedly down

Do you have logs for that?
tengine_stonith_notify() got called, that should have been enough to
get the node cleaned up in the cib.

> ...
> Dec 19 11:53:40 vd01-a pengine: [1909]: WARN: stage6: Scheduling Node
> vd01-b for STONITH
> ...
> Dec 19 11:53:40 vd01-a crmd: [1910]: info: te_fence_node: Executing
> reboot fencing operation (249) on vd01-b (timeout=6)
> ...
> Dec 19 11:53:40 vd01-a stonith-ng: [1905]: info: call_remote_stonith:
> Requesting that vd01-c perform op reboot vd01-b
>
> And so on.
>
> I can't investigate this one in more depth, because I use fence_xvm in
> this testing cluster, and it has issues when running more than one
> stonith resource on a node. Also, my RA (in a cluster where this testing
> cluster runs) undefines VM after failure, so fence_xvm does not see
> fenc

Re: [Pacemaker] [Problem]It is judged that a stopping resource is starting.

2012-01-15 Thread Andrew Beekhof
On Fri, Jan 6, 2012 at 12:37 PM,   wrote:
> Hi Andrew,
>
> Thank you for comment.
>
>> But it should have a subsequent stop action which would set it back to
>> being inactive.
>> Did that not happen in this case?
>
> Yes.

Could you send me the PE file related to this log please?

Jan  6 19:22:01 rh57-1 crmd: [3461]: info: do_te_invoke: Processing
graph 4 (ref=pe_calc-dc-1325845321-26) derived from
/var/lib/pengine/pe-input-4.bz2



> Log of "verify_stopped" is only recorded.
> The stop handling of resource that failed in probe was not carried out.
>
> -
> # yamauchi PREV STOP ##
> Jan  6 19:21:56 rh57-1 heartbeat: [3443]: info: killing 
> /usr/lib64/heartbeat/ifcheckd process group 3462 with signal 15
> Jan  6 19:21:56 rh57-1 ifcheckd: [3462]: info: crm_signal_dispatch: Invoking 
> handler for signal 15: Terminated
> Jan  6 19:21:56 rh57-1 ifcheckd: [3462]: info: do_node_walk: Requesting the 
> list of configured nodes
> Jan  6 19:21:58 rh57-1 ifcheckd: [3462]: info: main: Exiting ifcheckd
> Jan  6 19:21:58 rh57-1 heartbeat: [3443]: info: killing 
> /usr/lib64/heartbeat/crmd process group 3461 with signal 15
> Jan  6 19:21:58 rh57-1 crmd: [3461]: info: crm_signal_dispatch: Invoking 
> handler for signal 15: Terminated
> Jan  6 19:21:58 rh57-1 crmd: [3461]: info: crm_shutdown: Requesting shutdown
> Jan  6 19:21:58 rh57-1 crmd: [3461]: info: do_state_transition: State 
> transition S_IDLE -> S_POLICY_ENGINE [ input=I_SHUTDOWN cause=C_SHUTDOWN 
> origin=crm_shutdown ]
> Jan  6 19:21:58 rh57-1 crmd: [3461]: info: do_state_transition: All 1 cluster 
> nodes are eligible to run resources.
> Jan  6 19:21:58 rh57-1 crmd: [3461]: info: do_shutdown_req: Sending shutdown 
> request to DC: rh57-1
> Jan  6 19:21:59 rh57-1 crmd: [3461]: info: handle_shutdown_request: Creating 
> shutdown request for rh57-1 (state=S_POLICY_ENGINE)
> Jan  6 19:21:59 rh57-1 attrd: [3460]: info: attrd_trigger_update: Sending 
> flush op to all hosts for: shutdown (1325845319)
> Jan  6 19:21:59 rh57-1 attrd: [3460]: info: attrd_perform_update: Sent update 
> 14: shutdown=1325845319
> Jan  6 19:21:59 rh57-1 crmd: [3461]: info: abort_transition_graph: 
> te_update_diff:150 - Triggered transition abort (complete=1, tag=nvpair, 
> id=status-1fdd5e2a-44b6-44b9-9993-97fa120072a4-shutdown, name=shutdown, 
> value=1325845319, magic=NA, cib=0.101.16) : Transient attribute: update
> Jan  6 19:22:01 rh57-1 crmd: [3461]: info: crm_timer_popped: New Transition 
> Timer (I_PE_CALC) just popped!
> Jan  6 19:22:01 rh57-1 crmd: [3461]: info: do_pe_invoke: Query 44: Requesting 
> the current CIB: S_POLICY_ENGINE
> Jan  6 19:22:01 rh57-1 crmd: [3461]: info: do_pe_invoke_callback: Invoking 
> the PE: query=44, ref=pe_calc-dc-1325845321-26, seq=1, quorate=0
> Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: unpack_config: On loss of CCM 
> Quorum: Ignore
> Jan  6 19:22:01 rh57-1 pengine: [3464]: info: unpack_config: Node scores: 
> 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
> Jan  6 19:22:01 rh57-1 pengine: [3464]: WARN: unpack_nodes: Blind faith: not 
> fencing unseen nodes
> Jan  6 19:22:01 rh57-1 pengine: [3464]: info: determine_online_status: Node 
> rh57-1 is shutting down
> Jan  6 19:22:01 rh57-1 pengine: [3464]: ERROR: unpack_rsc_op: Hard error - 
> prmVIP_monitor_0 failed with rc=6: Preventing prmVIP from re-starting 
> anywhere in the cluster
> Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: group_print:  Resource Group: 
> grpUltraMonkey
> Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: native_print:      prmVIP     
>   (ocf::heartbeat:LVM):   Stopped
> Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: group_print:  Resource Group: 
> grpStonith1
> Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: native_print:      
> prmStonith1-2        (stonith:external/ssh): Stopped
> Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: native_print:      
> prmStonith1-3        (stonith:meatware):     Stopped
> Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: group_print:  Resource Group: 
> grpStonith2
> Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: native_print:      
> prmStonith2-2        (stonith:external/ssh): Started rh57-1
> Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: native_print:      
> prmStonith2-3        (stonith:meatware):     Started rh57-1
> Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: clone_print:  Clone Set: 
> clnPingd
> Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: short_print:      Started: [ 
> rh57-1 ]
> Jan  6 19:22:01 rh57-1 pengine: [3464]: info: rsc_merge_weights: clnPingd: 
> Rolling back scores from prmVIP
> Jan  6 19:22:01 rh57-1 pengine: [3464]: info: native_color: Resource 
> prmPingd:0 cannot run anywhere
> Jan  6 19:22:01 rh57-1 pengine: [3464]: info: native_color: Resource prmVIP 
> cannot run anywhere
> Jan  6 19:22:01 rh57-1 pengine: [3464]: info: rsc_merge_weights: 
> prmStonith1-2: Rolling back scores from prmStonith1-3
> Jan  6 19:22:01 rh57-1 pengine: [3464]: info: native_c

Re: [Pacemaker] cman multi-homed with udp-broadcast issues

2012-01-15 Thread Andrew Beekhof
This is getting into some pretty specific cman knowledge; you might
find more experts on that at linux-clus...@redhat.com

On Sat, Jan 7, 2012 at 4:00 AM, Patrick H.  wrote:
> So I'm trying to setup a cluster with a secondary communication ring in case
> the first ring fails. The cluster operates fine, but doesnt seem to handle
> path failure properly. When I break the path between the 2 nodes on ring 1,
> I get the following in the logs:
>
> Jan  6 16:55:17 syslog02.cms.usa.net corosync[13931]:   [TOTEM ]
> Incrementing problem counter for seqid 202 iface 165.212.15.49 to [1 of 3]
> Jan  6 16:55:19 syslog02.cms.usa.net corosync[13931]:   [TOTEM ] ring 1
> active with no faults
> Jan  6 16:55:24 syslog02.cms.usa.net corosync[13931]:   [TOTEM ]
> Incrementing problem counter for seqid 204 iface 165.212.15.49 to [1 of 3]
> Jan  6 16:55:26 syslog02.cms.usa.net corosync[13931]:   [TOTEM ] ring 1
> active with no faults
> Jan  6 16:55:30 syslog02.cms.usa.net corosync[13931]:   [TOTEM ]
> Incrementing problem counter for seqid 206 iface 165.212.15.49 to [1 of 3]
> Jan  6 16:55:32 syslog02.cms.usa.net corosync[13931]:   [TOTEM ] ring 1
> active with no faults
>
> And it just repeats over and over. From notes I've found from others, it
> appears this might be because of each ring sharing the same broadcast
> address. Indeed this is the case as `cman_tool status` shows
> Multicast addresses: 255.255.255.255 255.255.255.255
> Node addresses: 165.212.64.49 165.212.15.49
>
> However I've tried changing this address in the cluster.conf and it seems to
> be completely ignored. I've also tried changing the port for the second ring
> and thats also ignored (tcpdump shows them still going to the same port as
> ring 0).
>
> So, is this indeed the cause of it not properly detecting ring failure? And
> if so, how can I fix it?
>
>
> cluster.conf:
> [cluster.conf XML elements stripped by the mailing list archive]
>
>



Re: [Pacemaker] Two-node cluster, able to function with only a single-node?

2012-01-15 Thread Andrew Beekhof
On Tue, Dec 13, 2011 at 9:40 AM, Reid, Mike  wrote:
> I've been doing a lot of testing while I've been learning
> Pacemaker/Heartbeat, DRBD, OCFS2, etc. (Nginx, PHP-FPM, Apache…). Below I've
> attached my working CIB, but I worry I am missing something obvious (or
> perhaps I'm going about it incorrectly altogether): it seems I cannot start
> the cluster as a single node, and sometimes, if the other node is having
> issues, the functioning node does not mount the file system, etc., until
> I've set the other node to "Standby", restarted, etc. Shouldn't I be able to
> run on a single node with HA? I presume I've not configured something
> correctly; possibly I'm using "clone" incorrectly for my needs?

This looks suspicious:
   primitive resST-NULL stonith:null

We're not going to do anything until we've fenced the second node.
So depending on what this does, it might explain what you're seeing.
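
If you want real fencing in a test setup, a rough sketch would be something
like the following (external/ssh is a testing-only agent, and the parameter
name is from memory, so check the agent's metadata first):

    primitive resST-M1 stonith:external/ssh \
            params hostlist="MACHINE1"
    primitive resST-M2 stonith:external/ssh \
            params hostlist="MACHINE2"
    location locST-M1 resST-M1 -inf: MACHINE1
    location locST-M2 resST-M2 -inf: MACHINE2

The negative location constraints just keep each fencing resource off the
node it is meant to shoot.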
>
> Basically, for this two-node web server cluster, I'm running OCFS2 on
> Primary/Primary DRBD, with Nginx + PHP-FPM / Apache. I need a little help
> confirming whether the CIB configuration below is the appropriate way to
> handle this: as long as one of the nodes' DRBD is UpToDate (Primary), then
> ideally I'd be able to mount the OCFS2 Filesystem (resFS), start PHP-FPM
> (resPHP), start NGINX (resPROXY), and then Apache (resAPACHE) and handle web
> traffic. Don't worry, I'm not storing any access logs or sessions in DRBD;
> I'm using local disk + memcached for that.
>
> Is my usage of clone / order correct here, or is that perhaps what's
> blocking me from running on a single node more reliably? I've been able to
> do it by going from a working cluster to setting one node to "standby" or
> simulating a power fault; however, I haven't been able to simply start a
> single node and have it work as described above.
>
> Software:
> OS: Ubuntu 10.10 (Maverick) / Kernel: 2.6.35
> Corosync 1.2.1
> DRBD 8.3.10
> OCFS2: v1.5.0
>
>
> Pacemaker config / CIB:
>
> node MACHINE1 \
>         attributes standby="off"
> node MACHINE2 \
>         attributes standby="off"
> primitive resAPACHE ocf:heartbeat:apache \
>         params configfile="/usr/local/apache/conf/httpd.conf" \
>         op monitor interval="1min" \
>         op start interval="0" timeout="40" \
>         op stop interval="0" timeout="60"
> primitive resDLM ocf:pacemaker:controld \
>         op monitor interval="120s"
> primitive resDRBD ocf:linbit:drbd \
>         params drbd_resource="repdata" \
>         operations $id="resDRBD-operations" \
>         op monitor interval="20s" role="Master" timeout="120s" \
>         op monitor interval="30s" role="Slave" timeout="120s"
> primitive resFS ocf:heartbeat:Filesystem \
>         params device="/dev/drbd/by-res/repdata" directory="/data"
> fstype="ocfs2" \
>         op monitor interval="120s"
> primitive resO2CB ocf:pacemaker:o2cb \
>         op monitor interval="120s"
> primitive resPHP ocf:heartbeat:anything \
>         params binfile="/usr/local/sbin/php-fpm"
> cmdline_options="--fpm-config /usr/local/etc/php-fpm.conf"
> pidfile="/var/run/php-fpm.pid" \
>         op start interval="0" timeout="20" \
>         op stop interval="0" timeout="30" \
>         op monitor interval="20" \
>         meta target-role="Started"
> primitive resPROXY ocf:heartbeat:nginx \
>         params conffile="/etc/nginx/nginx.conf" \
>         op monitor interval="60s" \
>         op start interval="0" timeout="40" \
>         op stop interval="0" timeout="60"
> primitive resST-NULL stonith:null \
>         params hostlist="MACHINE1 MACHINE2"
> ms msDRBD resDRBD \
>         meta resource-stickines="100" notify="true" master-max="2"
> interleave="true"
> clone cloneDLM resDLM \
>         meta globally-unique="false" interleave="true"
> clone cloneFS resFS \
>         meta interleave="true" ordered="true" target-role="Started"
> clone cloneHTTPD resAPACHE \
>         meta globally-unique="false" interleave="true" ordered="true"
> target-role="Started"
> clone cloneO2CB resO2CB \
>         meta globally-unique="false" interleave="true"
> clone clonePHP resPHP \
>         meta globally-unique="false" interleave="true" ordered="true"
> target-role="Started"
> clone clonePROXY resPROXY \
>         meta globally-unique="false" interleave="true" ordered="true"
> target-role="Started"
> clone fencing resST-NULL
> colocation colDLMDRBD inf: cloneDLM msDRBD:Master
> colocation colFSO2CB inf: cloneFS cloneO2CB
> colocation colO2CBDLM inf: cloneO2CB cloneDLM
> order ordDLMO2CB inf: cloneDLM cloneO2CB
> order ordDRBDDLM inf: msDRBD:promote cloneDLM
> order ordO2CBFS inf: cloneO2CB cloneFS
> order ordPHP inf: cloneFS clonePHP
> order ordPROXY inf: clonePHP clonePROXY
> order ordWEB inf: cloneFS cloneHTTPD
> property $id="cib-bootstrap-options" \
>         dc-version="1.0.9-unknown" \
>         cluster-infrastructure="openais" \
>         stonith-enabled="true" \
>         no-quorum-policy="ignore" \
>         expected-quorum-votes="2"
>
>
>
>
> NOTE: Please disrega

Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2012-01-15 Thread Vladislav Bogdanov
16.01.2012 09:20, Andrew Beekhof wrote:
[snip]
>>> At the same time, stonith_admin -B succeeds.
>>> The main difference I see is st_opt_sync_call in a latter case.
>>> Will try to experiment with it.
>>
>> Yes!!!
>>
>> Now I see following:
>> Dec 19 11:53:34 vd01-a cluster-dlm: [2474]: info:
>> pacemaker_terminate_member: Requesting that node 1090782474/vd01-b be fenced
> 
> So the important question... what did you change?

Nice you're back ;)

+ rc = st->cmds->fence(st, *st_opt_sync_call*, node_uname, "reboot", 120);

I'm attaching my resulting version of pacemaker.c (which still has a lot of
mess left over from the different approaches I tried and needs a cleanup).
The function to look at is pacemaker_terminate_member(), which is an almost
one-to-one copy of crm_terminate_member_no_mainloop(), apart from a variable
rename to compile without warnings and the change to the ->fence() arguments.
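
For anyone following along, here is roughly what that boils down to. This is
a sketch only, reconstructed from the diff line above and from the general
crm_terminate_member_no_mainloop() pattern; it is not the exact attached code:

    /* sketch: synchronous fencing request via the stonith-ng client API */
    stonith_t *st = stonith_api_new();
    int rc = st->cmds->connect(st, "cluster-dlm", NULL);

    if (rc == stonith_ok) {
        /* st_opt_sync_call makes the call block until the fencing
         * operation completes, instead of returning as soon as the
         * request has been queued */
        rc = st->cmds->fence(st, st_opt_sync_call, node_uname, "reboot", 120);
    }

    st->cmds->disconnect(st);
    stonith_api_delete(st);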

> 
>> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info:
>> initiate_remote_stonith_op: Initiating remote operation reboot for
>> vd01-b: 21425fc0-4311-40fa-9647-525c3f258471
>> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: crm_get_peer: Node
>> vd01-c now has id: 1107559690
>> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: stonith_command:
>> Processed st_query from vd01-c: rc=0
>> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: crm_get_peer: Node
>> vd01-d now has id: 1124336906
>> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: stonith_command:
>> Processed st_query from vd01-d: rc=0
>> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: stonith_command:
>> Processed st_query from vd01-a: rc=0
>> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: call_remote_stonith:
>> Requesting that vd01-c perform op reboot vd01-b
>> Dec 19 11:53:34 vd01-a stonith-ng: [1905]: info: crm_get_peer: Node
>> vd01-b now has id: 1090782474
>> ...
>> Dec 19 11:53:40 vd01-a stonith-ng: [1905]: info: stonith_command:
>> Processed st_fence_history from cluster-dlm: rc=0
>> Dec 19 11:53:40 vd01-a crmd: [1910]: info: tengine_stonith_notify: Peer
>> vd01-b was terminated (reboot) by vd01-c for vd01-a
>> (ref=21425fc0-4311-40fa-9647-525c3f258471): OK
>>
>> But, then I see minor issue that node is marked to be fenced again:
>> Dec 19 11:53:40 vd01-a pengine: [1909]: WARN: pe_fence_node: Node vd01-b
>> will be fenced because it is un-expectedly down
> 
> Do you have logs for that?
> tengine_stonith_notify() got called, that should have been enough to
> get the node cleaned up in the cib.

Ugh, it seems so, but they are archived already. I'll get them back onto the
nodes and try to compose an hb_report for them (but the PE inputs are already
lost; do you still need the logs without them?)
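
Something along these lines is what I have in mind (a sketch only; option
names as I remember them from hb_report(8), and the destination path is made
up):

    # collect logs and cluster state from all nodes for the window around the test
    hb_report -f "2011-12-19 11:50" -t "2011-12-19 12:00" /tmp/vd01-fencing-report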

> 
>> ...
>> Dec 19 11:53:40 vd01-a pengine: [1909]: WARN: stage6: Scheduling Node
>> vd01-b for STONITH
>> ...
>> Dec 19 11:53:40 vd01-a crmd: [1910]: info: te_fence_node: Executing
>> reboot fencing operation (249) on vd01-b (timeout=6)
>> ...
>> Dec 19 11:53:40 vd01-a stonith-ng: [1905]: info: call_remote_stonith:
>> Requesting that vd01-c perform op reboot vd01-b
>>
>> And so on.
>>
>> I can't investigate this one in more depth, because I use fence_xvm in
>> this testing cluster, and it has issues when running more than one
>> stonith resource on a node. Also, my RA (in the cluster where this testing
>> cluster runs) undefines the VM after a failure, so fence_xvm no longer sees
>> the fencing victim via qpid and is unable to fence it again.
>>
>> Maybe it is possible to check whether a node was just fenced and skip the
>> redundant fencing?
> 
> If the callbacks are being used correctly, it shouldn't be required
#include 

#include "config.h"
#include "dlm_daemon.h"

#include 
#include 
#include 

#include 

#include 
#include 
#include 
/* heartbeat support is irrelevant here */
#undef SUPPORT_HEARTBEAT 
#define SUPPORT_HEARTBEAT 0
#include 
#include 
#include 
#include 
#include 
#include 

#define COMMS_DIR "/sys/kernel/config/dlm/cluster/comms"

int setup_ccs(void)
{
    /* To avoid creating an additional place for the dlm to be configured,
     * only allow configuration from the command-line until CoroSync is stable
     * enough to be used with Pacemaker
     */
    cfgd_groupd_compat = 0; /* always use libcpg and disable backward compat */
    return 0;
}

void close_ccs(void) { return; }
int get_weight(int nodeid, char *lockspace) { return 1; }

/* TODO: Make this configurable
 * Can't use logging.c as-is as whitetank exposes a different logging API
 */
void init_logging(void) {
    openlog("cluster-dlm", LOG_PERROR|LOG_PID|LOG_CONS|LOG_NDELAY, LOG_DAEMON);
    /* cl_log_enable_stderr(TRUE); */
}

void setup_logging(void) { return; }
void close_logging(void) {
    closelog();
}

extern int ais_fd_async;

char *local_node_uname = NULL;
void dlm_process_node(gpointer key, gpointer value, gpointer user_data);

int setup_cluster(void)
{
    ais_fd_async = -1;
    crm_log_init("cluster-dlm", LOG_INFO, FALSE, TRUE, 0, NULL);

    if(init_ais_connection(NULL, NULL, NULL, &local_node

Re: [Pacemaker] large cluster design questions

2012-01-15 Thread Andrew Beekhof
On Fri, Jan 6, 2012 at 10:10 PM, Christian Parpart  wrote:
> Hey all,
>
> I am also about to evaluate whether or not Pacemaker+Corosync is the
> way to go for our
> infrastructure.
>
> We currently have about 45 physical nodes (plus about 60 more virtual
> containers) with a largely static, historically grown setup of services.

You should be able to get totem (corosync's membership algorithm) to
scale to 32 nodes, but it will need some tweaking of the timing
parameters.
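
As a very rough illustration of the knobs involved (the values below are
placeholders, not recommendations; see corosync.conf(5) for the semantics
and sensible numbers for your node count):

    totem {
        version: 2
        # how long to wait for a token before declaring a fault (ms)
        token: 10000
        # retransmits before the token is considered lost
        token_retransmits_before_loss_const: 10
        # must exceed token; time to reach consensus before a new membership round (ms)
        consensus: 12000
        # how long to wait for join messages during membership formation (ms)
        join: 60
    }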

>
> I am now supposed to restructure this historically grown system into
> something clean and maintainable, with HA and scalability in mind
> (there is no hurry; we have some time to design it).
>
> So here is what we mainly have or will have:
>
> -> HAproxy (tcp/80, tcp/443, master + (hot) failover)
> -> http frontend server(s) (doing SSL and static files, in case of
> performance issues -> clone resource).
> -> Varnish (backend accelerator)
> -> HAproxy (load-balancing backend app)
> -> Rails (app nodes, clones)
> 
> - sharded memcache cluster (5 nodes), no failover currently (memcache
> cannot replicate :( )
> - redis nodes
> - mysql (3 nodes: active master, master, slave)
> - Solr (1 master, 2 slaves)
> - resque (many nodes)
> - NFS file storage pool (master/slave DRBD + ext3 fs currently, want
> to use GFS2/OCFS2 however)
>
> Now, I have read a lot of people saying a pacemaker cluster should not
> exceed 16 nodes, and many others saying this statement is bullsh**. While
> I now lean towards the latter, I still want to know:
>
>    is it still wise to build a single pacemaker/corosync driven
> cluster out of all the services above?
>
> One question I also have: when pacemaker is managing your resources and
> migrates a resource from one host (because that one went down) to another,
> the service should actually be able to access all of its data on the new
> node, too. Which leads to the assumption that you have to install
> *everything* on every node to actually be able to start anything anywhere
> (depending on where pacemaker is about to put it and the scores the admin
> has defined).

Well you can tell us not to put the service on a particular (set of) node(s).
Just make sure you have something recent and we should gracefully
detect that the RA/software isn't available and move on somewhere
else.
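
For example (crm shell syntax; the resource and node names here are invented
for illustration), a negative location constraint keeps a resource off a node
entirely:

    # never run the Rails app on the database host
    location loc-no-rails-on-db01 resRails -inf: db01

More elaborate placement, such as restricting a resource to nodes carrying a
particular attribute, can be expressed with rules instead of a plain node
score.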

>
> Many thanks for your thoughts on this,
> Christian.
>



Re: [Pacemaker] [Partially SOLVED] pacemaker/dlm problems

2012-01-15 Thread Vladislav Bogdanov
16.01.2012 09:17, Andrew Beekhof wrote:
[snip]
>> At the same time, stonith_admin -B succeeds.
>> The main difference I see is st_opt_sync_call in a latter case.
>> Will try to experiment with it.
> 
> /Shouldn't/ matter.

It really looks like it matters.

I can't discuss it in more depth, though, because I lack knowledge of the
underlying design. :(



Re: [Pacemaker] setup multimaster drbd with ocfs without o2cb and controld

2012-01-15 Thread Andrew Beekhof
On Fri, Jan 6, 2012 at 3:49 AM, thomas polnik  wrote:
> Hello,
>
> I want to set up the following:
> 2 servers with dual-primary (multi-master) DRBD and an OCFS2 filesystem, mounted on both.
>
> *system: gentoo, 2.6.39, pacemaker-1.1.5
>
> *Setup pacemaker:
>
> # misc settings
> property no-quorum-policy="ignore"
> property stonith-enabled="false"
> rsc_defaults resource-stickiness="200"
>
> # drbd setup
> primitive resDrbd ocf:linbit:drbd params drbd_resource="images" op start
> interval="0" timeout="240s" op stop interval="0" timeout="240s"
> ms msDrbd resDrbd meta master-max="2" clone-max="2" notify="true"
>
> # ocfs setup via lsb:ocfs2
> primitive resOcfs lsb:ocfs2 op monitor interval="20" timeout="40" meta
> is-managed="true" target-role="Started"
> clone clResOcfs resOcfs meta target-role="Started"
>
> # FS setup
> primitive resFsImages ocf:heartbeat:Filesystem \
>    params device="/dev/drbd/by-res/images" directory="/srv/images"
> fstype="ocfs2" options="rw,noatime" \
>    op start interval="0" timeout="60s" \
>    op stop interval="0" timeout="60s"
> clone clResFsImages resFsImages \
>    meta target-role="Started"
>
> # setup order of primitives
> order grDrbdOcfsFs inf: msDrbd:promote clResOcfs:start clResFsImages:start
>
> Problem:
> If one server goes down and comes back, the other node unmounts /srv/images,
> shuts down OCFS2 and stops DRBD, and then both nodes start the services
> again. After this, all services work fine again, but I have an outage of
> about 5 seconds.

I think that was a bug in 1.1.5
Try 1.1.6?

>
> I think this is not necessary; I don't know why pacemaker shuts down all
> services on the running node.
>
> btw: I cannot use ocf:ocfs2:o2cb and ocf:pacemaker:controld ([1]) because
> pacemaker-1.1.5 on a Gentoo system does not offer me these agents, so I
> chose lsb:ocfs2 to use OCFS2.

That means OCFS2 is using its own internal cluster comms.
So they and pacemaker may not agree on who the cluster members are at
some stage... then things are going to get /really/ interesting for
you.
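
For comparison, the stack described in [1] keeps membership in Pacemaker's
hands. A rough sketch (crm shell; resource names invented, and it assumes the
controld and o2cb agents are actually installed, using your existing
clResFsImages clone) would be:

    primitive resDLM ocf:pacemaker:controld op monitor interval="120s"
    primitive resO2CB ocf:ocfs2:o2cb op monitor interval="120s"
    clone clDLM resDLM meta interleave="true"
    clone clO2CB resO2CB meta interleave="true"
    colocation colO2CBwithDLM inf: clO2CB clDLM
    colocation colFSwithO2CB inf: clResFsImages clO2CB
    order ordDLMthenO2CB inf: clDLM clO2CB
    order ordO2CBthenFS inf: clO2CB clResFsImages

That way DLM/O2CB take their membership from Pacemaker/corosync rather than
from OCFS2's own stack, which avoids the disagreement described above.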

>
> Has anybody a hint for me, how can I prevent it?
>
> Best regards,
> thomas.
>
>
> [1] http://www.drbd.org/users-guide/s-ocfs2-pacemaker.html
>



Re: [Pacemaker] [Problem]It is judged that a stopping resource is starting.

2012-01-15 Thread renayama19661014
Hi Andrew,

Thank you for your comments.

> Could you send me the PE file related to this log please?
> 
> Jan  6 19:22:01 rh57-1 crmd: [3461]: info: do_te_invoke: Processing
> graph 4 (ref=pe_calc-dc-1325845321-26) derived from
> /var/lib/pengine/pe-input-4.bz2

The old file has disappeared.
I am sending the log and the PE file, reproduced with the same procedure.

 * trac1818.zip   
  * https://skydrive.live.com/?cid=3a14d57622c66876&id=3A14D57622C66876%21127
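
(As a side note, a PE input like pe-input-4.bz2 can usually be replayed
offline to see what the policy engine decided; a sketch, assuming a
reasonably recent Pacemaker that ships crm_simulate:)

    # replay the transition from the PE input and show the resulting actions
    crm_simulate -x pe-input-4.bz2 -S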

Best Regards,
Hideo Yamauchi.


--- On Mon, 2012/1/16, Andrew Beekhof  wrote:

> On Fri, Jan 6, 2012 at 12:37 PM,   wrote:
> > Hi Andrew,
> >
> > Thank you for comment.
> >
> >> But it should have a subsequent stop action which would set it back to
> >> being inactive.
> >> Did that not happen in this case?
> >
> > Yes.
> 
> Could you send me the PE file related to this log please?
> 
> Jan  6 19:22:01 rh57-1 crmd: [3461]: info: do_te_invoke: Processing
> graph 4 (ref=pe_calc-dc-1325845321-26) derived from
> /var/lib/pengine/pe-input-4.bz2
> 
> 
> 
> > Only the "verify_stopped" log is recorded.
> > The stop handling of the resource that failed in the probe was not carried out.
> >
> > -
> > # yamauchi PREV STOP ##
> > Jan  6 19:21:56 rh57-1 heartbeat: [3443]: info: killing /usr/lib64/heartbeat/ifcheckd process group 3462 with signal 15
> > Jan  6 19:21:56 rh57-1 ifcheckd: [3462]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
> > Jan  6 19:21:56 rh57-1 ifcheckd: [3462]: info: do_node_walk: Requesting the list of configured nodes
> > Jan  6 19:21:58 rh57-1 ifcheckd: [3462]: info: main: Exiting ifcheckd
> > Jan  6 19:21:58 rh57-1 heartbeat: [3443]: info: killing /usr/lib64/heartbeat/crmd process group 3461 with signal 15
> > Jan  6 19:21:58 rh57-1 crmd: [3461]: info: crm_signal_dispatch: Invoking handler for signal 15: Terminated
> > Jan  6 19:21:58 rh57-1 crmd: [3461]: info: crm_shutdown: Requesting shutdown
> > Jan  6 19:21:58 rh57-1 crmd: [3461]: info: do_state_transition: State transition S_IDLE -> S_POLICY_ENGINE [ input=I_SHUTDOWN cause=C_SHUTDOWN origin=crm_shutdown ]
> > Jan  6 19:21:58 rh57-1 crmd: [3461]: info: do_state_transition: All 1 cluster nodes are eligible to run resources.
> > Jan  6 19:21:58 rh57-1 crmd: [3461]: info: do_shutdown_req: Sending shutdown request to DC: rh57-1
> > Jan  6 19:21:59 rh57-1 crmd: [3461]: info: handle_shutdown_request: Creating shutdown request for rh57-1 (state=S_POLICY_ENGINE)
> > Jan  6 19:21:59 rh57-1 attrd: [3460]: info: attrd_trigger_update: Sending flush op to all hosts for: shutdown (1325845319)
> > Jan  6 19:21:59 rh57-1 attrd: [3460]: info: attrd_perform_update: Sent update 14: shutdown=1325845319
> > Jan  6 19:21:59 rh57-1 crmd: [3461]: info: abort_transition_graph: te_update_diff:150 - Triggered transition abort (complete=1, tag=nvpair, id=status-1fdd5e2a-44b6-44b9-9993-97fa120072a4-shutdown, name=shutdown, value=1325845319, magic=NA, cib=0.101.16) : Transient attribute: update
> > Jan  6 19:22:01 rh57-1 crmd: [3461]: info: crm_timer_popped: New Transition Timer (I_PE_CALC) just popped!
> > Jan  6 19:22:01 rh57-1 crmd: [3461]: info: do_pe_invoke: Query 44: Requesting the current CIB: S_POLICY_ENGINE
> > Jan  6 19:22:01 rh57-1 crmd: [3461]: info: do_pe_invoke_callback: Invoking the PE: query=44, ref=pe_calc-dc-1325845321-26, seq=1, quorate=0
> > Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: unpack_config: On loss of CCM Quorum: Ignore
> > Jan  6 19:22:01 rh57-1 pengine: [3464]: info: unpack_config: Node scores: 'red' = -INFINITY, 'yellow' = 0, 'green' = 0
> > Jan  6 19:22:01 rh57-1 pengine: [3464]: WARN: unpack_nodes: Blind faith: not fencing unseen nodes
> > Jan  6 19:22:01 rh57-1 pengine: [3464]: info: determine_online_status: Node rh57-1 is shutting down
> > Jan  6 19:22:01 rh57-1 pengine: [3464]: ERROR: unpack_rsc_op: Hard error - prmVIP_monitor_0 failed with rc=6: Preventing prmVIP from re-starting anywhere in the cluster
> > Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: group_print:  Resource Group: grpUltraMonkey
> > Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: native_print:      prmVIP       (ocf::heartbeat:LVM):   Stopped
> > Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: group_print:  Resource Group: grpStonith1
> > Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: native_print:      prmStonith1-2        (stonith:external/ssh): Stopped
> > Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: native_print:      prmStonith1-3        (stonith:meatware):     Stopped
> > Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: group_print:  Resource Group: grpStonith2
> > Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: native_print:      prmStonith2-2        (stonith:external/ssh): Started rh57-1
> > Jan  6 19:22:01 rh57-1 pengine: [3464]: notice: native_print:      prmStonith2-3        (stonith:meat