Re: how does exclude_period work?

2005-10-12 Thread Jim Trocki

On Tue, 11 Oct 2005, David Nolan wrote:

> The segfault bug is triggered by calling a text parsing function (from a
> standard perl module, Text::ParseWords) with particularly large input.


It's a regexp in Text::ParseWords that chokes because the default
stack size on most systems isn't sufficient for it. I've seen it crash
once, and then not crash when rerun after raising the stack size with
ulimit. Oh well. Intellectual curiosity satisfied, time to move on.


> Jim was talking with me recently about actually designating something a
> stable version...  This seems like one more big reason to stop calling 0.99.2
> the stable version.  How about it Jim?  Call mon-1-1-0pre2 Mon 1.1 and cut a
> release?


Well, I like the release numbering convention that the Linux kernel uses,
where the first number to the right of the decimal point signifies a
stable release if it is an even number, or a development release if it
is an odd number.

I think we should just fork the cvs tree and call mon-1-1-0pre2 the
super fantabulous "mon 1.2" (tag it as mon-1-2-0), then the head will
be 1.3, the work-in-progress, possibly unstable, possibly stable,
experimental-feature-laden code.
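
For illustration only, here is a minimal Perl sketch of the even/odd
convention described above. The helper name is_stable_release is
hypothetical and not part of mon; it assumes version strings of the
form "major.minor" with an optional patch level.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of mon): applies the even/odd convention
# to a version string such as "1.2" or "1.3.0".
sub is_stable_release {
    my ($version) = @_;

    # Capture the first number to the right of the first dot.
    my ($minor) = $version =~ /^\d+\.(\d+)/
        or die "unrecognized version string: $version\n";

    # An even minor number means stable, an odd one means development.
    return $minor % 2 == 0 ? 1 : 0;
}

for my $v (qw(1.1 1.2 1.3)) {
    printf "%-5s %s\n", $v, is_stable_release($v) ? "stable" : "development";
}
```

Under this reading, 1.1 counts as a development number, which is why
the proposal is to release the current code as 1.2 and let the head
become 1.3.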

___
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon


Re: how does exclude_period work?

2005-10-12 Thread David Nolan



--On Wednesday, October 12, 2005 3:06 AM -0400 Jim Trocki 
<[EMAIL PROTECTED]> wrote:



> Well, I like the release numbering convention that the Linux kernel uses,
> where the first number to the right of the decimal point signifies a
> stable release if it is an even number, or a development release if it
> is an odd number.
>
> I think we should just fork the cvs tree and call mon-1-1-0pre2 the
> super fantabulous "mon 1.2" (tag it as mon-1-2-0), then the head will
> be 1.3, the work-in-progress, possibly unstable, possibly stable,
> experimental-feature-laden code.


That's fine with me.  I like conventions and standards.  And in this case it 
means I can start tagging 1.3.* versions at will and maybe people will test 
them if we're lucky.


The only thing committed to CVS right now that I don't think belongs in 1.2 
is the global exclude_period feature I added yesterday.  That's the only tag 
since mon-1-1-0pre2, so we can just re-tag those versions as 1.2.


-David


David Nolan<*>[EMAIL PROTECTED]
curses: May you be forced to grep the termcap of an unclean yacc while
 a herd of rogue emacs fsck your troff and vgrind your pathalias!



A few bugfixes missing from fantabulous mon...

2005-10-12 Thread Ed Ravin
On Wed, Oct 12, 2005 at 09:54:00AM -0400, David Nolan wrote:
> --On Wednesday, October 12, 2005 3:06 AM -0400 Jim Trocki 
> <[EMAIL PROTECTED]> wrote:
> >I think we should just fork the cvs tree and call mon-1-1-0pre2 the
> >super fantabulous "mon 1.2" (tag it as mon-1-2-0), then the head will
> >be 1.3, the work-in-progress, possibly unstable, possibly stable,
> >experimental-feature-laden code.
...
> The only thing committed to CVS right now that I don't think belongs in 1.2 
> is the global exclude_period feature I added yesterday.  That's the only tag 
> since mon-1-1-0pre2, so we can just re-tag those versions as 1.2.

There are a couple of bugfixes that I reported to mon-devel that didn't
seem to make it into mon-1-1-0pre2 - I think they ought to be included
in "mon 1.2":

This fix is to prevent an upalert from being issued for an ack'd watch
that sent out an ackalert:

@@ -626,12 +626,12 @@
my $pref = \%{$sref->{"periods"}->{$periodlabel}};
 
#
-   # skip upalerts not paired with down alerts
+   # skip upalerts/ackalerts not paired with down alerts
# disable by setting "no_comp_alerts" in period section
#
-   if (!$pref->{"no_comp_alerts"} && ($flags & $FL_UPALERT) && !$pref->{"_alert_sent"})
+   if (!$pref->{"no_comp_alerts"} && ($flags & ($FL_UPALERT | $FL_ACKALERT)) && !$pref->{"_alert_sent"})
{
-   syslog ('debug', "$group/$service/$periodlabel: Suppressing upalert since no down alert was sent.");
+   syslog ('debug', "$group/$service/$periodlabel: Suppressing upalert or ackalert since no down alert was sent.");
next;
}
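
The combined mask in the patched condition can be tried in isolation.
This is only a sketch: the flag values and the suppress_alert helper
are assumptions made for illustration, and mon defines its own
constants.

```perl
use strict;
use warnings;

# Assumed flag values for illustration only; mon defines its own constants.
my $FL_UPALERT  = 1 << 0;
my $FL_ACKALERT = 1 << 1;
my $FL_OTHER    = 1 << 2;    # hypothetical stand-in for any other alert flag

# Hypothetical helper mirroring the patched condition. Returns 1 when the
# alert should be skipped: it is an upalert or an ackalert, no down alert
# was sent in this period, and the period does not set no_comp_alerts.
sub suppress_alert {
    my ($flags, $pref) = @_;
    my $suppress = !$pref->{"no_comp_alerts"}
        && ($flags & ($FL_UPALERT | $FL_ACKALERT))
        && !$pref->{"_alert_sent"};
    return $suppress ? 1 : 0;
}

# An ackalert with no earlier down alert is suppressed.
print suppress_alert($FL_ACKALERT, {}), "\n";
# An upalert that follows a down alert still goes out.
print suppress_alert($FL_UPALERT, { "_alert_sent" => 1 }), "\n";
```

Before the patch only $FL_UPALERT was in the mask, so an ackalert
passed through this check even when no down alert had been sent.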
 
--
This bugfix below makes sure upalerts have the right message from the
last failed monitor.  I forget whether this was related to getting ackalerts
working properly, but it clearly fixes a feature that wasn't doing what
it was supposed to:


@@ -3295,6 +3296,8 @@
 (!defined($sref->{"upalertafter"}) 
  || (($tmnow - $sref->{"_first_failure"}) >= $sref->{"upalertafter"}
{
+   # Save the last failing monitor's output for posterity
+   $sref->{"_upalertoutput"}= $sref->{"_last_output"};
do_alert ($group, $service, $sref->{"_upalertoutput"}, 0, $FL_UPALERT);
}
 


--
I also contributed a few fixes to the alerts that don't seem to
be in mon-1-1-0pre2 - none of the alerts knew about the options for the
new forms of alerts (like ackalerts and trapalerts).  Here are my local
patches to snpp.alert:

28c28
< use vars qw /$opt_g $opt_q $opt_s $opt_t $opt_u/;
---
> use vars qw /$opt_g $opt_q $opt_s $opt_t/;
50c50
< my $t = localtime ($opt_t || time);
---
> my $t = localtime ($opt_t);
55,57c55
< my $ALERT=   $opt_u ? "UPALERT" : "ALERT";
< my $GROUP=   $opt_g || $ENV{MON_GROUP};
< my $SERVICE= $opt_s || $ENV{MON_SERVICE};
---
> $ALERT = $opt_u ? "UPALERT" : "ALERT";
59c57
< $snpp->send ( Pager => [ @ARGV ], Message => "$ALERT $GROUP/$SERVICE: $summary ($wday $mon $day $tm)" );
---
> $snpp->send ( Pager => [ @ARGV ], Message => "$ALERT $opt_g/$opt_s: $summary ($wday $mon $day $tm)" );
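
The fallbacks on the "<" side of the diff above can be sketched on
their own. The alert_fields helper and its hash arguments are
hypothetical stand-ins for the $opt_* variables and %ENV; this is not
the code in snpp.alert.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the fallback logic. %$opt stands in for the
# $opt_g/$opt_s/$opt_t/$opt_u option variables, %$env for %ENV.
sub alert_fields {
    my ($opt, $env) = @_;
    return +{
        alert   => $opt->{u} ? "UPALERT" : "ALERT",
        # As in the patch, || is used, so an empty or "0" option value
        # also falls back to the environment.
        group   => $opt->{g} || $env->{MON_GROUP},
        service => $opt->{s} || $env->{MON_SERVICE},
        # With no -t value, use the current time.
        time    => $opt->{t} || time(),
    };
}

my $f = alert_fields({ u => 1 },
                     { MON_GROUP => "routers", MON_SERVICE => "ping" });
print "$f->{alert} $f->{group}/$f->{service}\n";
```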

-

And finally, none of the new alert types (startupalert, ackalert, disablealert)
are documented in the Mon man page.

-- Ed
