Re: Syslog problems

2011-07-27 Thread Jim Trocki

On Wed, 27 Jul 2011, Allan Wind wrote:



Debian fixed this with Bug#611751 I believe.  Surprised if that
did not make it upstream.



It's in the CVS HEAD, has been for a while, which is newer than mon-1.2.0.
Time for a 1.2.1 I guess.

___
mon mailing list
mon@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/mon


Re: Failure to find designated monitor reported as untested

2011-06-20 Thread Jim Trocki

On Wed, 15 Jun 2011, Nico Kadel-Garcia wrote:


I believe you, but wasn't seeing them in syslog, even with -d in the
command line.



If I can talk you, as the original author, into enhancing your perl to
report a missing status rather than merely untested, I'd
appreciate that quite a lot.


i just checked in changes to the trunk for module mon and mon-client
which added a configerr state. there are changes to clients/monshow
and clients/mon.cgi which should reflect this. i tested it a bit and
it seems to work, so it'd be nice if you could check out the code and
confirm for yourself that it works.

it may take some amount of time before the read-only public cvs repo is
synced with the devel repo at sf. these are the revisions:



 cvs ci -m "added configerr state"
cvs commit: Examining .
cvs commit: Examining alert.d
cvs commit: Examining cgi-bin
cvs commit: Examining clients
cvs commit: Examining clients/skymon
cvs commit: Examining doc
cvs commit: Examining etc
cvs commit: Examining mon.d
cvs commit: Examining muxpect
cvs commit: Examining state.d
cvs commit: Examining utils
Checking in mon;
/cvsroot/mon/mon/mon,v  --  mon
new revision: 1.27; previous revision: 1.26
done
Mailing mon-com...@lists.sourceforge.net...
Generating notification message...
Generating notification message... done.
Checking in clients/mon.cgi;
/cvsroot/mon/mon/clients/mon.cgi,v  --  mon.cgi
new revision: 1.6; previous revision: 1.5
done
Checking in clients/monshow;
/cvsroot/mon/mon/clients/monshow,v  --  monshow
new revision: 1.4; previous revision: 1.3
done
Mailing mon-com...@lists.sourceforge.net...
Generating notification message...
Generating notification message... done.
Checking in doc/mon.8;
/cvsroot/mon/mon/doc/mon.8,v  --  mon.8
new revision: 1.8; previous revision: 1.7
done
Mailing mon-com...@lists.sourceforge.net...
Generating notification message...
Generating notification message... done.

cvs ci -m "added configerr state"
cvs commit: Examining .
cvs commit: Examining Mon
Checking in Mon/Client.pm;
/cvsroot/mon/mon-client/Mon/Client.pm,v  --  Client.pm
new revision: 1.4; previous revision: 1.3
done
Mailing mon-com...@lists.sourceforge.net...
Generating notification message...
Generating notification message... done.



Re: Failure to find designated monitor reported as untested

2011-06-14 Thread Jim Trocki

On Tue, 14 Jun 2011, Nico Kadel-Garcia wrote:


Not being executable, I think, should be the test, rather than merely
not being present.


well, both are tested and if problems are found, reported to syslog.
other related troubles are reported as well.



Re: New mon admin, some questions

2011-06-13 Thread Jim Trocki

On Sat, 11 Jun 2011, Nico Kadel-Garcia wrote:


Thanks. Given that the published tarball is from 2007, perhaps it's
time then to collect some patches and do a minor update release?


yes.


And what are the chances of getting that shifted over to Sourceforge's
supported git access, to allow people like me to do local patching and
tweaks and branching and submit the changes when we're ready?



I've helped migrate CVS or Subversion projects on Sourceforge to the
git access before. It's quite easy, and works well. It also helps
encourage minor upgrades to be submitted.


until now there's been no demand for it. i'm open to it, however, as a
good excuse to learn git.



Re: New mon admin, some questions

2011-06-11 Thread Jim Trocki

On Sat, 11 Jun 2011, Nico Kadel-Garcia wrote:


Is there a central code base now for updates? The Wiki and codebase at
https://mon.wiki.kernel.org/index.php/Main_Page/ seems to have last
been updated in 2007. I do see some published patches, mostly from
Nathan


the most recent code is available from cvs.

http://sourceforge.net/scm/?type=cvs&group_id=170

this includes mon, the client lib, and the contrib repository, all of
which are maintained and have recent additions.

you can browse it here:

http://mon.cvs.sourceforge.net/mon/

nathan's enhancements, which are appreciated, are maintained by him on
his own web pages.


I'm also running into issues with daemontools integration rather than
the normal init scripts published by my supported Linux distribution
(which I've been trying to integrate per a local demand), and would
happily publish notes for it.


sure, whatever you have that may be helpful to others is welcome. post
it to the mailing list, and we'll incorporate it into cvs.


I'm also running into issues with monshow output clipping the names
of services and groups automatically to align the columns, which makes
descriptive service names kind of tough to use. Has anyone published a
tweak to monshow to create more verbose output?


yes, it uses perl's format feature for textual output, and the format
definition will truncate a field which is too long. one could change
the format definition to allow for a wider field, or change the textual
output code to dynamically size the column width, or to use something
other than the built-in format mechanism.
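
as a rough illustration of the difference between a fixed-width field and
a dynamically sized one, here is a shell sketch. it is not monshow's code
(monshow is perl and uses format); the function names print_fixed and
print_fitted and the 10-character width are made up for the example.

```shell
#!/bin/sh
# Illustration only: monshow is Perl and uses format(); this sketch just
# contrasts a fixed-width column with a computed one.
# print_fixed, print_fitted and the width of 10 are assumptions.

# Fixed width: names longer than 10 characters are cut off.
print_fixed() {
    while read -r name status; do
        printf '%-10.10s %s\n' "$name" "$status"
    done
}

# Computed width: the first pass finds the longest name, the second prints.
print_fitted() {
    rows=$(cat)
    [ -n "$rows" ] || return 0
    width=$(printf '%s\n' "$rows" |
        awk '{ if (length($1) > w) w = length($1) } END { print w + 0 }')
    printf '%s\n' "$rows" | while read -r name status; do
        printf "%-${width}s %s\n" "$name" "$status"
    done
}
```

both functions read "name status" pairs on stdin, one per line. the
second one buffers its input, which is fine for a status listing but
would not suit an endless stream.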



Re: UPALERT output

2011-02-07 Thread Jim Trocki

On Sun, 6 Feb 2011, gulikoza wrote:


I'm new at setting mon and I have a question regarding UPALERT content.
UPALERTs seem to include MON_LAST_OUTPUT instead of current monitor output.
This seems very confusing to me. I am currently only working with (my
modified) DRBD monitor so I don't know how other monitors handle that.
Please excuse me if there is some really important reason why UPALERT should
contain last error which apparently I've missed, but having mon send me
current output seems to make a lot more sense :). Instead of:


[... lots of newlines]



Above output was achieved applying the following patch:


- do_alert ($group, $service, $sref->{"_upalertoutput"}, 0, $FL_UPALERT);
+ do_alert ($group, $service, $output, 0, $FL_UPALERT);



sure, that's acceptable, as long as the alert script knows what to expect.

the idea for sending the previous output to the upalert script is so
you can easily send an alert that shows what error condition was fixed.

i guess another way to do it is to send both the previous and the current
output to the alert script and let it decide what to do, or to provide
a way to configure which to send to the alert.



Re: Mon Wiki Slow

2011-02-02 Thread Jim Trocki

On Tue, 1 Feb 2011, Brandon S Allbery KF8NH wrote:



On 2/1/11 20:40 , Nathan Gibbs wrote:

Same Here, its been crawling ever since it was switched to https.
I mean, seriously, are the wiki contents that classified? ;-)
If we needed an encrypted data stream, we could just read some of the mon code
in plain text.


Google firesheep.  Also, the last machine I saw https overhead cause
significant performance issues on was a Sparc IPX; I have to think it's a
bit more involved than that.


right, even a cursory inspection shows that it has nothing to do with
ssl. the certificate is sent immediately, it's verified by the client,
the cipher spec is established, the symmetrical key exchange happens,
and the encrypted channel begins working without hesitation. apache
responds immediately to bad urls. it's the response from the wiki
server itself that takes the time, not anything related to the ssl
protocol. brandon, i agree that something else is the complication.

i had a brief chat with the admin, he's aware of the problem and knows
that the behavior has changed, but he has a higher-priority iron in the
fire at the moment and he'll continue working on the wiki issue once
that cools off. so, thanks for the reports everyone, and hang in there.



Re: dtlog error

2011-02-01 Thread Jim Trocki

On Tue, 1 Feb 2011, ad...@jack-clan.nl wrote:


Hi,

I installed mon in ubuntu server lts 10.04, i have it running fine, but i
can't get to the downtime log:


Could not list downtime log on mon server localhost: 520 list dtlog error, 
dtlogging is not turned on (perhaps you don't have permissions in auth.cf?)


I checked auth.cf and have set all the rights to all:



ack:all


try adding to auth.cf:

dtlog: all



Re: Versioned 'mon-contrib' download

2010-10-30 Thread Jim Trocki

On Sat, 30 Oct 2010, Dario Minnucci wrote:


It's important to me to find a way to get a versioned 'mon-contrib' so that I
can detect changes to the upstream tgz file and keep the Debian package
up-to-date.


https://sourceforge.net/projects/mon/files/

See mon-contrib-1.0.tar.gz. I've tagged the mon-contrib module in the
sf repository with mon-contrib-1-0 and exported that, which is exactly
that tarball.

Is that sufficient?



Re: MON on windows

2010-09-13 Thread Jim Trocki

On Mon, 13 Sep 2010, Anders Synstad wrote:


Hello!

I was wondering if anyone had ever tried getting MON to work on windows,
using something like ActivePerl or Strawberry Perl?


It's been a little while since I last looked into it, but if I remember
correctly, the problem with getting it to run was related to the socket
handling (I think).



If anyone has looked into it, or done some modifications to make it work, I 
am very interested in hearing from you.


I've run it on Windows via Cygwin with no modifications.



Re: multi RBL monitor

2010-05-22 Thread Jim Trocki

On Sat, 22 May 2010, Ed Ravin wrote:


Looking back over past posts to the Mon list, I see that the script
I just re-submitted to the list was inspired by Tim Hanes's work, but
I had to do a total rewrite to use asynchronous I/O.



Speaking of credit, I see my author line was edited out of the version
Jim put into CVS.  What's that about?


I checked into CVS exactly what was in your post, so I don't have the
answer to that.

I've edited what's in there now to include proper attribution.



Re: Mon Wiki Dead Links

2010-05-22 Thread Jim Trocki

On Thu, 20 May 2010, Nathan Gibbs wrote:


These links on the references page of the wiki don't go to what they claim.

16  How to monitor your system from your mobile, from PC Plus
17  Monitoring services using Mon, Detect and Alert, by Alvaro del Castillo
20  Jesper Krogh's Jabber Transport mon script
24  Using Jabber with Mon and Nagios

Also on this page.

https://mon.wiki.kernel.org/index.php/Monitors

The motion detection link at the bottom is dead.


updated



Re: multi RBL monitor

2010-05-21 Thread Jim Trocki

On Sat, 22 May 2010, Noel Butler wrote:


Hrmm, it is not there now, that's why you can't find it. The original was
from 2003; I don't know why it was pulled, I'll find out.


http://linux.kernel.org/pipermail/mon/2007-July/001645.html

Not sure what happened to it in the repo. Maybe I fatfingered something.
I know in the past some other things Ed has posted to the list haven't
made it into the repo, so sorry for that.

I just added it again, and it should show up in the read-only cvs repo on
sourceforge shortly.  It's in monitors/dnsbl, and I named it rbl.monitor
for the sake of historical record :)



Re: Fwd: Re: Fwd: Re: getting mon to execute shell command

2010-04-26 Thread Jim Trocki

On Mon, 26 Apr 2010, ad...@jack-clan.nl wrote:




yes, now it works,

but if i start mon as daemon ,eg /etc/init.d/mon start
it doesn't work


it's probably your environment. be more explicit with the paths to the
programs you call from your alert script by either setting a path at
the top of the script or by specifying the full path to each.



Re: Fwd: Re: Fwd: Re: Fwd: Re: getting mon to execute shell command

2010-04-26 Thread Jim Trocki

On Mon, 26 Apr 2010, ad...@jack-clan.nl wrote:


ok, i read a lot about setting paths in your bash profile but that isn't
what you mean i think,
a daemon is already active without a user logged in?

but when i look at my script i specify the path to gammu in the second
line :
that should be enough or am i thinking the wrong way?


#!/usr/bin/bash

echo servers down test | /usr/bin/gammu --sendsms TEXT 12345678


if it works when you do -d -f then from that shell do

echo $PATH

and put that path at the top of your bash script, e.g.

#!/usr/bin/bash

export PATH=/bin:/usr/bin:whatever
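
a slightly more defensive version of the same idea, as a sketch: set the
path explicitly and complain if a needed program still can't be found,
since a daemon started from an init script usually gets a much smaller
environment than a login shell. the helper name require_cmd and the PATH
value are assumptions for illustration; gammu is the program from the
script quoted above.

```shell
#!/bin/sh
# Sketch of an alert script that does not depend on the caller's environment.
# The PATH value and the helper name require_cmd are assumptions.
PATH=/bin:/usr/bin:/usr/local/bin
export PATH

# Print the resolved location of a command, or complain on stderr and return 1.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        command -v "$1"
    else
        echo "alert: cannot find '$1' in PATH=$PATH" >&2
        return 1
    fi
}

# Example use inside the alert (commented out so the sketch has no side effects):
# gammu=$(require_cmd gammu) || exit 1
# echo "servers down test" | "$gammu" --sendsms TEXT 12345678
```

the message on stderr is only useful if mon's output ends up somewhere
you can read it, e.g. when running in the foreground with -d -f.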




Re: Fwd: Re: getting mon to execute shell command

2010-04-25 Thread Jim Trocki

On Sun, 25 Apr 2010, ad...@jack-clan.nl wrote:


hostgroup domain
   domain.org

watch domain
   service tcp
   interval 1m
   monitor tcp.monitor -p 16001
   period
   numalerts 2
   alert sms.alert
   upalert sms.alert


you need to specify a period, such as wd {Sun-Sat}


period wd {Sun-Sat}
numalerts 2
alert sms.alert
upalert sms.alert



Re: Define Working Day

2010-03-23 Thread Jim Trocki

On Tue, 23 Mar 2010, Bryan Chapman wrote:


Hi,



Here is a snippet of our MON config



define(_ALWAYS_,`$1 wd {Mon-Sun}')dnl

define(_WORKING_HOURS_,`$1 wd {Mon-Fri} hr {9am-4pm}')dnl

define(_OOH_,$1 `wd {Mon-Fri} hr {4pm-9am},wd {Sat-Sun}')dnl



I wish to change the working day to 0715 to 0545 but I'm at a loss...


week begins 0715 monday morning and ends 0545 saturday morning:

wd {Mon-Sat} hr {8am-4am}, wd {Tue-Sat} hr {5am} min {0-45}, wd {Mon-Fri} hr {7am} min {15-59}
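
when working out a period like this it can help to write the intended
window down separately and test a few edge times against it. the sketch
below models only the window as stated (monday 07:15 up to saturday
05:45); it does not parse mon period strings, and the function name
in_working_week is an assumption.

```shell
#!/bin/sh
# Sketch: is a moment inside the window "Monday 07:15 up to Saturday 05:45"?
# Arguments: day of week (1 = Monday ... 7 = Sunday), hour (0-23), minute (0-59),
# given without leading zeros so the shell does not read them as octal.
# The function name in_working_week is an assumption.
in_working_week() {
    now=$(( ($1 - 1) * 1440 + $2 * 60 + $3 ))   # minutes since Monday 00:00
    start=$(( 7 * 60 + 15 ))                    # Monday 07:15
    end=$(( 5 * 1440 + 5 * 60 + 45 ))           # Saturday 05:45
    [ "$now" -ge "$start" ] && [ "$now" -lt "$end" ]
}
```

edge cases worth trying are the minute before and after each boundary,
e.g. "in_working_week 1 7 14" and "in_working_week 6 5 45" should both
return false.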



Re: Feature Request HostGroup in environment.

2010-01-06 Thread Jim Trocki

On Wed, 6 Jan 2010, David Nolan wrote:


I think that's a great idea.

Which is probably why it already exists... :)

try MON_GROUP and MON_SERVICE

(I see that the documentation doesn't list those for monitors, only
for alerts, but they do exist and work.)


i just updated mon.8 to reflect all of the environment variables set in
run_monitor and in call_alert. there were a couple things missing.
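
a minimal sketch of a monitor that picks these up. MON_GROUP and
MON_SERVICE are the variables named in this thread; the fallback text
"unknown" and the function name describe_check are assumptions.

```shell
#!/bin/sh
# Sketch of a monitor that labels its output with the group and service names.
# The fallback value and the function name are assumptions for illustration.
describe_check() {
    group=${MON_GROUP:-unknown}
    service=${MON_SERVICE:-unknown}
    echo "checking service '$service' in hostgroup '$group'"
}

describe_check
```

run by hand, outside of mon, it prints "unknown" for both, which is a
quick way to tell whether the variables are reaching the script.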



Re: All Right! Where is it.

2009-12-19 Thread Jim Trocki

On Sat, 19 Dec 2009, Nathan Gibbs wrote:


While reading this page

http://mon.wiki.kernel.org/index.php/Monitors

On the wiki, I noticed something.

In the List was.
Uptime via Asynchronous SNMP



2. Is it the asyncreboot.monitor?


yes



Re: Mon alerts to DB

2009-12-17 Thread Jim Trocki

On Thu, 17 Dec 2009, Anthony wrote:


Hi,
Could someone give me some direction of how I could get Mon alerts into a 
database? The objective it to look into creating an alert management system 
requiring a DB.


I agree with Hans Peter's advice, to write an alert script which pumps
the details of the failure into the database. If you want to capture
both failures and non-failures, look at the redistribute config option.

If the DB is something which is connection-oriented, such as mysql
or postgres, it would be better to write a simple daemon which stays
connected and listens on an AF_UNIX socket for your alert messages.  your
alert script could then write to that socket so it does not need to log
into the db each time it is invoked. that'll help minimize the overhead.

also, there are alternate syslog daemons such as rsyslog which can store
the log data in various databases.


It doesn't seem Mon can input to a DB nativity.


o/~ joy to the world, the DB has come...



Re: LDAPS monitoring on port 636

2009-12-16 Thread Jim Trocki

On Wed, 16 Dec 2009, Smaïne Kahlouch wrote:


Hi guys,

I would like to know if it's possible to monitor my ldap server using
the ldap.monitor or another way and by issuing a request to the port 636
instead of port 389.


simply use monitor ldap.monitor --port 636


Re: opstatus meanings

2009-12-14 Thread Jim Trocki

On Mon, 14 Dec 2009, Alex Dean wrote:


I'm not always sure what to make of the output of 'moncmd show opstatus'.

0 = failure
1 = ok

These seem pretty clear (though they're backward from normal exit codes, and 
that confused me at first).  I sometimes also see status '7', which seems to 
happen when mon starts up, before the monitor has been run.  It also pops up 
if there's some configuration problem, like mon can't find the monitor 
script.


Is there any more complete list of opstatus values and their meanings?  I saw 
this question from the mailing list back in 2005, but there was no response.

http://www.mail-archive.com/mon@linux.kernel.org/msg01647.html


#
# operational statuses
#
($STAT_FAIL, $STAT_OK, $STAT_COLDSTART, $STAT_WARMSTART, $STAT_LINKDOWN,
 $STAT_UNKNOWN, $STAT_TIMEOUT, $STAT_UNTESTED, $STAT_DEPEND, $STAT_WARN) = (0..9);



: uplift ~/mon/export/mon-1.2.0$; cat doc/README.variables 
$Id: README.variables,v 1.1.1.1 2004/06/09 05:18:06 trockij Exp $


[...]


_op_status
STAT_FAIL       the monitor returned a failure
STAT_OK         the monitor returned a success
STAT_COLDSTART  a coldstart trap was received
STAT_WARMSTART  a warmstart trap was received
STAT_LINKDOWN   a linkdown trap was received
STAT_UNKNOWN    unknown (reserved for stupid things)
STAT_TIMEOUT    a trap timeout occurred
STAT_UNTESTED   this service has not yet been tested
STAT_DEPEND     this service has been marked by the depend routines
STAT_WARN       a warning state
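
Putting the two listings together, the numbers shown by 'moncmd show
opstatus' can be turned back into names with a small lookup. This is a
sketch that follows the (0..9) assignment quoted above; the function name
opstatus_name is an assumption, and states added to mon after 1.2.0 are
not covered.

```shell
#!/bin/sh
# Map a numeric opstatus to its symbolic name, following the (0..9)
# assignment quoted above. The function name opstatus_name is an assumption.
opstatus_name() {
    case "$1" in
        0) echo STAT_FAIL ;;
        1) echo STAT_OK ;;
        2) echo STAT_COLDSTART ;;
        3) echo STAT_WARMSTART ;;
        4) echo STAT_LINKDOWN ;;
        5) echo STAT_UNKNOWN ;;
        6) echo STAT_TIMEOUT ;;
        7) echo STAT_UNTESTED ;;
        8) echo STAT_DEPEND ;;
        9) echo STAT_WARN ;;
        *) echo "unrecognized opstatus: $1" >&2; return 1 ;;
    esac
}

# opstatus_name 7 prints STAT_UNTESTED, the value seen before a monitor has run.
```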



Re: Updated Clam AV monitor

2009-10-31 Thread Jim Trocki

On Sat, 31 Oct 2009, Nathan Gibbs wrote:


* Nathan Gibbs wrote:

I just updated the Clam AV monitor.



The Clamav Team listed this monitor on their site.


That's good news, but a URL for it would make the good news better :)



Re: Scheduled downtime functionality

2009-10-16 Thread Jim Trocki

On Sat, 17 Oct 2009, Res wrote:


I'm currently planning to implement this as a second process that
continuously reads a flat csv file containing what,when,how long\n.


Absolutely yes, IMHO it is the only place to add such a function, since it is 
the governing process.


i already have something which can do that, with minor mods. i doubt
anyone knows about it, though :)

http://search.cpan.org/~trockij/

the ones named schedule.

the config is kept in csv format so it can be edited easily with a
spreadsheet.

from the pod:

Schedule::Oncall provides methods to manipulate an on-call schedule.
One or more tables of schedules can be maintained, loaded, and
searched.  An on-call table is composed of seven days, where each
day has a list of minute ranges which correspond to a particular person.

Information such as email address, pager number, etc. may be stored in
the schedule configuration file. Simple variable assignments may also
be made. Other textual information may be stored in the schedule in
order to assist other applications (e.g., html headers or email body
text), and variables substitution may occur within the text blocks.

Schedule files may be chosen based on weekly or monthly rotations,
relative to the first week or month of the year. Weekly schedules
begin on a Monday and end on a Sunday, the same as strftime(3)'s
%W format. Each rotation is stored in a separate file, and the
appropriate rotation is chosen at load time.

also there is some discussion of it here:

http://www.kernel.org/pub/software/admin/mon/mon-talk-1.2.pdf



Re: mon.cgi development

2009-10-09 Thread Jim Trocki

On Sat, 3 Oct 2009, Nathan Gibbs wrote:


I have a revised version of mon.cgi at
http://www.cmpublishers.com/oss/


this system is inaccessible from some networks. the first syn leaves
but is unanswered.



Re: mon.cgi development

2009-10-08 Thread Jim Trocki

On Tue, 6 Oct 2009, Nathan Gibbs wrote:


http://blogs.umass.edu/choogend/2007/09/12/mon-%E2%80%94-the-ultimate-minimalist-monitoring-tool/




Nice write up.  Someone should put that on the mon wiki.


added to the references page on the wiki. thanks, everyone.



Re: monitoring snmp traps

2009-04-03 Thread Jim Trocki
On Thu, 2 Apr 2009, Jim Trocki wrote:

 snmptrapd can be told to decode traps matching particular OIDs and send
 them to an external program's stdin, or it can forward traps matching
 some OIDs to another destination, on the same machine or elsewhere.


i did a bit more poking around and discovered NetSNMP::TrapReceiver

NAME
NetSNMP::TrapReceiver - Embedded perl trap handling for Net-SNMP's
snmptrapd


ABSTRACT
The NetSNMP::TrapReceiver module is used to register perl subroutines
into the Net-SNMP snmptrapd process.


...

http://cpansearch.perl.org/src/HARDAKER/NetSNMP-TrapReceiver-5.0401/

man NetSNMP::TrapReceiver has a detailed example. it looks as if you
could write a simple perl routine which would look at the oid and then
convert a snmp trap into a mon trap, all from within the snmptrapd
process. that sounds like the beauty way to go.



Re: monitoring snmp traps

2009-04-02 Thread Jim Trocki
On Tue, 31 Mar 2009, Mike Ireton wrote:

 running on the same host. The snmp traps received by snmptrapd get
 written to syslog where I have a process (syslog-ng / swatch) running to
 do primitive decoding of the messages and sending alerts under some

   I am no snmp guru but it seems that it should be possible and
 desirable for there to be many snmp trap receivers on the same host,
 each handling one or more subsets of the mib tree.

   My question simply is how might anyone suggest to handle multiple traps
 from different devices or is there something obvious in the net-snmp
 package that I've missed or perhaps a mon feature I could use for this
 purpose?

snmptrapd can be told to decode traps matching particular OIDs and send
them to an external program's stdin, or it can forward traps matching
some OIDs to another destination, on the same machine or elsewhere.
check out the man page for snmptrapd.conf, the traphandle and forward
settings.

you may be able to run a separate snmptrapd on the same host but listening on
a different port with a different configuration which uses traphandle to
call your mon-related trap processor, and the main snmptrapd would use the
forward config to send the work to the appropriate handler. this
would free up the main snmptrapd to handle other logging and forwarding
while allowing you to use multiple processes to handle the specific work.
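
as a sketch of what a traphandle program might look like: per
snmptrapd.conf(5) the handler is given the sender's hostname, then its
address, then one "OID VALUE" pair per line on stdin. the function name
summarize_trap and the output layout are assumptions, and how the result
is handed to mon is left out because that part is site-specific.

```shell
#!/bin/sh
# Sketch of a program that snmptrapd could run via a traphandle directive.
# Input on stdin: hostname, then transport address, then "OID VALUE" lines.
# The function name summarize_trap and the output layout are assumptions.
summarize_trap() {
    read -r host
    read -r addr
    echo "trap from $host ($addr)"
    while read -r oid value; do
        if [ -n "$oid" ]; then
            echo "  $oid = $value"
        fi
    done
    return 0
}

# A real handler would call summarize_trap on its own stdin and pass the
# result on to whatever feeds mon.
```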



Re: unidentified output from fping

2009-04-02 Thread Jim Trocki
On Wed, 1 Apr 2009, Alain wrote:

 I looked at fping.monitor from the latest distribution (mon-1.2.0) and
 see there's been a number of changes since the version I was using
 (0.99.2-13). Sure enough using this latest version of fping.monitor
 resolved the problem. However, I'm still curious what exactly the old
 fping.monitor saw that the new one doesn't? Any ideas?

the adjustment was to handle some extra icmp messages from routers which
indicate that a host was unreachable, rather than relying on just the timeout.



Re: Config File Questions

2008-06-05 Thread Jim Trocki
On Thu, 5 Jun 2008, Bryan Chapman wrote:

 Hi all,



 Our config file is getting a bit out of hand, it's 2600 lines long now
 and a bit of a headache to manage.



 Is it possible to have a main config file and to 'include' other
 sub-config files?

certainly. there are a few ways to do this.

out of the box, if you use a config file that ends with .m4, mon will
pass that through m4 and use the results as the config file. then
you could use m4's features to do the including and tons of fancy
stuff if you want. for example:

mon.m4:

# this is the main config file
include(part1.cf)dnl

include(part2.cf)dnl


part1.cf:
here's part 1


part2.cf:
here's part 2


should result in:

# this is the main config file
here's part 1

here's part 2


you may have to fiddle with the paths so the include knows where to
look.

you could also just use whatever other kind of macro-processing tool you have
an affinity for, and then just make the output go to mon.cf before you tell mon
to load it.
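
the simplest version of that last approach is plain concatenation. a
sketch, assuming the pieces live in one directory and sort into the order
you want; the names build_config and conf.d and the .cf suffix are made up
for the example, not mon conventions.

```shell
#!/bin/sh
# Sketch: build one config file from fragments kept in a directory.
# build_config, conf.d and the *.cf suffix are assumptions for illustration.
build_config() {
    src_dir=$1
    out_file=$2
    tmp_file="$out_file.tmp.$$"
    # The shell expands the glob in sorted order, so 10-hosts.cf precedes 20-watches.cf.
    cat "$src_dir"/*.cf > "$tmp_file" || { rm -f "$tmp_file"; return 1; }
    # Replace the live file only after the whole output has been written.
    mv "$tmp_file" "$out_file"
}

# Example (hypothetical paths): build_config /etc/mon/conf.d /etc/mon/mon.cf
```

keep the output file outside the fragment directory, or give it a
different suffix, so it doesn't get included in itself on the next run.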



Re: alert on success rather than failure

2008-04-11 Thread Jim Trocki
On Fri, 11 Apr 2008, aneeskA wrote:

 hi all,

 is there a way to raise an alert when something succeeds rather than fails?
 i was able to achieve this by reversing the return value that monitors
 give back, but i think that's not the proper way to do it. Any thoughts on
 this?

there is an option at the service level called redistribute which
will call an alert every time a monitor terminates, regardless of the
return status.


watch abc
    service xyz
        redistribute alertname arg1 arg2 ...
        period wd {mon-fri}
            ...


the alert would get the exit status from the environment variable
MON_RETVAL, so, given that, your alert could decide what to do.

this is a feature in mon 1.2.
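
a sketch of how such an alert could branch on the result. MON_RETVAL is
the variable named above, and treating 0 as success follows the usual
exit-status convention; the function name classify_result and the
send_notice command in the comment are hypothetical.

```shell
#!/bin/sh
# Sketch of an alert meant to be called through "redistribute".
# The function name classify_result is an assumption for illustration.
classify_result() {
    case "${MON_RETVAL:-}" in
        0)  echo "success" ;;
        '') echo "unknown" ;;   # variable not set or empty
        *)  echo "failure" ;;
    esac
}

# Example: only act when the monitor succeeded.
# [ "$(classify_result)" = "success" ] && send_notice   # send_notice is hypothetical
```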



Re: cisco_interface.monitor

2007-12-27 Thread Jim Trocki
On Thu, 27 Dec 2007, Alex Moen wrote:

 Looking to see if anyone has updated the cisco_interface.monitor to work with 
 modern perl implementations.  I am by no means a perlmonger, so am

I'm not familiar with this one. Where'd you find it?

There is something called snmp_interface.monitor in the contrib stuff
which uses the module from net-snmp.



Re: cisco_interface.monitor

2007-12-27 Thread Jim Trocki
On Thu, 27 Dec 2007, Alex Moen wrote:

 It was in the contribs area... Copyright (C) 1998, Brian Moore [EMAIL PROTECTED],
 modified July 2000 by Ed Ravin [EMAIL PROTECTED]

Oh, well then that looks the same as the snmp_interface.monitor in the contrib.
Maybe someone renamed it at some point on your end or maybe elsewhere, but I've
never seen it called that.

Anyway, you said snmpvar.monitor also exhibits this behavior, so I just tried
it on stock CentOS 5 and Fedora Core 6 installations I have in front of me
here, and it worked properly.  I also tried the snmp_interface.monitor and I
didn't see the error you reported about it not finding SNMP::Session. I also
recall testing the snmpvar.monitor which ships with mon-1.2.0 (obtained from
the contrib repository anyway) on a SLES 10 installation this summer and it was
fine.

Here's the version of the perl module from centos/rhel5:

Name        : net-snmp-perl                Relocations: (not relocatable)
Version     : 5.3.1                             Vendor: CentOS
Release     : 19.el5_1.3                    Build Date: Tue 18 Dec 2007 06:47:56 PM EST

What's yours? You could find out this way as well:

$ perl -MSNMP -e 'print $SNMP::VERSION, "\n"'
5.0301

Be sure to use the same path to perl as the one the script is using.



Re: Using depend ...

2007-12-10 Thread Jim Trocki
On Mon, 10 Dec 2007, Jacques Klein wrote:

 also dep_behavior = hm
 I am using mon-1.2.0 where I added


 shows that $sref->{_last_failure_time} is used but never set.


Try this patch and let me know if it helps:


--- mon 2007-12-10 13:35:43.0 -0500
+++ mon-dep 2007-12-10 13:38:48.0 -0500
@@ -1392,6 +1392,7 @@
     $sref->{_start_of_monitor} = time if (!defined($sref->{_start_of_monitor}));
     $sref->{_alert_count} = 0 if (!defined($sref->{_alert_count}));
     $sref->{_last_failure} = 0 if (!defined($sref->{_last_failure}));
+    $sref->{_last_failure_time} = 0 if (!defined($sref->{_last_failure_time}));
     $sref->{_last_success} = 0 if (!defined($sref->{_last_success}));
     $sref->{_last_trap} = 0 if (!defined($sref->{_last_trap}));
     $sref->{_last_traphost} = '' if (!defined($sref->{_last_traphost}));
@@ -3287,6 +3288,7 @@
     $sref->{_failure_count}++;
     $sref->{_consec_failures}++;
     $sref->{_last_failure} = $tmnow;
+    $sref->{_last_failure_time} = $tmnow; # used by the dep_memory option
     if ($sref->{_op_status} == $STAT_OK ||
         $sref->{_op_status} == $STAT_UNKNOWN ||
         $sref->{_op_status} == $STAT_UNTESTED)



Re: avoid duplicated alerts in a multi-host/mon context

2007-10-17 Thread Jim Trocki
On Wed, 17 Oct 2007, Jacques Klein wrote:

 If I understand the depend, it's a way to avoid multiple alerts by
 specifying dependencies between services in ONE mon.
 If I take this concept, then it would have to be extended to
 dependencies between services in a GROUP of mon(s) (one per host),
 interesting but seems very complicated.

Yes, one of the ways you could implement this functionality is by using
traps to feed the status to a mon server which uses this input to control
the alerts and implement the dependencies. You are on the right track in
what you said in your previous mail.



Re: don't listen with mon

2007-10-11 Thread Jim Trocki
On Thu, 11 Oct 2007, board.divers wrote:

 Hello,

 I would like to know if it's possible to stop MON from listening on udp and tcp.

 I've tried commenting out some parts of the code (listen(SERVER,OCONNECT) for
 example) but all my tests have ended with my CPU at 100% ...

you can't do that, but if you don't want external things talking to it from the
network you can tell it to bind to only 127.0.0.1. look at the serverbind and
trapbind settings in the man page.



Re: mon project

2007-08-26 Thread Jim Trocki
On Sun, 26 Aug 2007, Augie Schwer wrote:

 Ooops. Duh. I wasn't on the mon@ list until recently; it doesn't show
 up on the SF.net page:

 http://sourceforge.net/mail/?group_id=170

shoot. i used to have a script which would push out any releases or web page
changes to the various places automatically, but since we have the wiki
i didn't bother with it. i guess i'll need to do some housecleaning and
make everything refer to the wiki for information so i don't need to
maintain stuff in a million places. i like the wiki a lot, and it makes
it much easier to allow others to participate.

to anyone who spends time editing wikis, i recommend these firefox/mozilla
extensions, which allow you to edit the TEXTAREA in your editor of choice,
and when you save it's automatically updated in the form:

https://addons.mozilla.org/en-US/firefox/addon/4125

and

http://mozex.mozdev.org/

the inherent editing features most browsers supply are crippling, if anyone's
noticed.

a historical tidbit: the reason why mon has a sourceforge entry in the first
place is because before sf.net even went public, they seeded it with various
projects, and mon was part of the first sowing. it wasn't even my idea :)

 Any chance the freshmeat page will get updated to reflect the new release:

 http://freshmeat.net/projects/mon/

ah, good catch. i'll update that and make it refer to the right place.



Re: mon project

2007-08-25 Thread Jim Trocki
On Sat, 25 Aug 2007, Augie Schwer wrote:

 As you can see there is a lot of development in CVS, but there hasn't
 been an official release for a while;

Except for the one at the end of June which had a significant number of
improvements, some of which you contributed yourself :)



Re: no_comp_alerts does not work when no alert script called

2007-07-25 Thread Jim Trocki
On Tue, 24 Jul 2007, Nicolas KOWALSKI wrote:

 +   $pref->{_no_comp_alerts_upalert_sent} = 0;
 +   }
 +
 #
 # skip upalerts/ackalerts not paired with down alerts
 # disable by setting no_comp_alerts in period section


 Does it look reasonable ?

 Anyone about it ?

Looks ok. I'll test it out over here soon.



ANNOUNCE: mon-1.2.0 and mon-client-1.2.0

2007-06-27 Thread Jim Trocki
The latest stable release is mon-1.2.0, and is accompanied by the Mon::Client
Perl module, mon-client-1.2.0.

We have a new wiki as well, from which you can find the download link and
a good deal of other information:

http://mon.wiki.kernel.org/index.php/Main_Page

I'd like to thank David Nolan, Ed Ravin, and Jon Meek for their extensive
contributions and interest in the project. I'd also like to thank everyone else
who has participated by submitting bug reports (and fixes), suggestions, and
monitor/alert code.



motion detection

2007-04-16 Thread Jim Trocki
I was able to whip up an idea for a monitor for mon which does some very
rudimentary motion detection. It didn't take very long to come up with. I wrote
it up here, for amusement:

http://arctic.org/~trockij/mon-motion-detector/

does that look like it would work reasonably well? the application would
be an image which doesn't change much until something moves into the frame,
like a car driving into the driveway or a person walking into a room.



Re: Disable all alerting for 20 minutes

2006-12-13 Thread Jim Trocki
On Wed, 13 Dec 2006, David Nolan wrote:

 You could also do something like write a script that uses Mon::Client and
 disables all hostgroups.  (This would show the status updates in the UI
 without sending alerts, at least with the current (CVS, 1.2.0rc1) Mon it
 would, I can't remember whether 0.99.2 did that.)

it would probably require less effort to just add a holdalerts feature
to the server, or something of that nature.

i can imagine this could be done a few different ways:

 1. walk through the watch structure and disable each

 2. have a global hold alerts flag which leaves the
watch structure alone but is respected by do_alert

i'd lean towards #2 because it wouldn't blow away any previously
disabled watches or services.

how's that sound?
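
Option 2 could look roughly like the following. This is a Python sketch of the idea only; the class and method names are made up for illustration and are not mon's actual do_alert code.

```python
import time


class AlertGate:
    """Global hold-alerts flag that leaves per-watch disable state untouched."""

    def __init__(self):
        self.hold_until = 0.0   # epoch seconds; 0 means no hold is active
        self.disabled = set()   # watches disabled individually by an operator

    def hold_alerts(self, seconds, now=None):
        """Suppress all alerts for the given number of seconds."""
        now = time.time() if now is None else now
        self.hold_until = now + seconds

    def may_alert(self, watch, now=None):
        """The check an alert routine would make before sending anything."""
        now = time.time() if now is None else now
        if now < self.hold_until:
            return False                  # global hold wins
        return watch not in self.disabled  # previous disables still respected
```

Because the hold is a separate flag with an expiry time, anything that was disabled before the hold is still disabled after it ends.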



Re: Disable all alerting for 20 minutes

2006-12-13 Thread Jim Trocki
On Wed, 13 Dec 2006, David Nolan wrote:

 If I'm understanding you correctly, Nagios provides a way to enter a
 one-time scheduled maintenance period via the interface?  I could see
 adding that to Mon, but would you want it to be global, or would you
 need a way to restrict it to a subset of the hostgroups?

it would be best to implement it both ways. then, people could just pick
which behavior they want.



Re: Starting

2006-09-08 Thread Jim Trocki
On Fri, 8 Sep 2006, David Nolan wrote:

 My best summary of Mon is that it's monitoring for sysadmins.

i totally concur with david. what he said is spot-on.

i will add a few things, though:

the design of mon is extremely flexible, and was purposefully built the way it
was in order to leverage other tools which already exist. it follows the
traditional Unix design philosophy, which i think is the most elegant system
design in existence to this very day. it is all about having a mechanism to
connect together lots of smaller tools which do one job very well in order to
solve larger problems, rather than writing a large tool for each new problem.

you can also think of this design in terms of using natural language, words and
grammar to phrase something you want to say. perl itself also follows this
model. larry wall is a linguist (a cunning one at that, sorry couldn't resist
the pun), and he applied that to perl.

for example, mon leverages fping, the net-snmp tools, traceroute, rrdtool, etc.
another example of mon's flexibility is how an on-call notification system with
escalation was added without changing anything in mon at all, it was just a
matter of writing a custom alert and plugging it in to your mon configuration
file with the correct grammar.

in order to get a good idea of how mon works, i would recommend
reading the slides from this presentation:

ftp://ftp.kernel.org/pub/software/admin/mon/mon-talk-0.4.tar.gz



RE: Unable to pass options in config file.

2006-09-05 Thread Jim Trocki
On Tue, 5 Sep 2006, Tim Carr wrote:

alert DHCPMonitor.alert -q primary



 There is nothing that is registered for the -q option by getopts.
 We've tried -x primary as well.  If, however, we run the alert
 manually with that option we do get something in the -q flag.

ok. i don't recall purposefully restricting parameters sent
to the alert, but maybe it is just a bug in the alert calling routine.

i'll look into it. should be simple to fix if indeed that is the problem.



Re: CVS Access broke?

2006-09-01 Thread Jim Trocki
On Fri, 1 Sep 2006, David Nolan wrote:

 He really should be using at least mon-1-1-0pre3; there were a couple of
 significant bugs in pre1, and there have been a couple of minor fixes since
 then.  Jim, if I tag the current code as mon-1-1-0pre4 and 
 mon-client-1-1-0pre3 can you put up tarballs of both of those, and maybe of 
 mon-contrib as well?  If you don't have the time I can put up images 
 somewhere else.

ok sure, and i guess we should just fork it and call the branch 1.2, or 2.0.
the head trunk we can begin calling 1.3 or 2.1, following the odd #s devel,
even #s stable paradigm.  i can take care of that and the updates to the web
page and other related stuff sometime within the next week.



Re: CVS Access broke?

2006-09-01 Thread Jim Trocki
On Fri, 1 Sep 2006, Bill Chmura wrote:

 I can attempt to get the cvs version again today.  I just stopped
 consulting and have taken up residence in an office with a few too many
 windows servers - I really want to get monitoring up so I know when to
 reset them :)

dude, just hook up a watchdog timer to the electrical circuits and reboot all
of them every evening or as needed, whichever comes first. you could write a
custom mon alert to take care of that :)



Re: CVS Access broke?

2006-08-31 Thread Jim Trocki
On Thu, 31 Aug 2006, Bill Chmura wrote:

 Which version is recommended at this point?

this should do you well:

ftp://ftp.kernel.org/pub/software/admin/mon/devel
 mon-1.1.0pre1.tar.gz
 mon-client-1.0.0pre2.tar.gz



RE: Getting 20 instead of spaces

2006-07-27 Thread Jim Trocki
On Wed, 26 Jul 2006, Tim Carr wrote:

 Is there a later version somewhere?  In looking through the mailing
 lists, there are several proposed patches after that date, but I don't
 see anything that relates to this problem.

use this one:

ftp://ftp.kernel.org/pub/software/admin/mon/devel/mon-client-1.0.0pre2.tar.gz



'Sunnyvale, You Have a Problem'

2006-05-18 Thread Jim Trocki

Think of a worldwide network of mon servers

http://online.wsj.com/public/article/SB114789800361955745-9Vxl4LJnClzLgOS1pOWq97vMX8M_20070517.html?mod=blogs



Re: pager alert continues to page multiple times after single failure

2006-04-20 Thread Jim Trocki

On Wed, 19 Apr 2006, Jon Meek wrote:


That blank line after monitor http.monitor is probably not a good thing.



yeah, that is the problem. a blank line signifies the end of a watch record.



Re: pager alert continues to page multiple times after single failure

2006-04-20 Thread Jim Trocki

On Thu, 20 Apr 2006, Brendan Mullen wrote:

The locally modified qpage.alert worked on an older version of Mon, but not 
1.0pre5   The page would be sent but never show up in the alert history, and 
then would be sent again.  and again...


Hmm.

Check your logs. Mon syslogs when an alert exits with a nonzero
status:

    if ($exitval)
    {
        syslog ('err', "child alert for " .
            "$args{group}/$args{service} " .
            "failed, exited with $exitval");
        return undef;
    }

If this is happening, it can screw up the alert management stuff (e.g.
_last_alert will not get updated).
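
To check a locally modified alert by hand, something like this shows whether it exits nonzero. It is a Python sketch, not mon's own code; the function name is an assumption, and a real alert also expects the summary on stdin and mon's usual arguments.

```python
import subprocess
import sys


def run_alert(argv):
    """Run an alert command and report whether it exited with status 0.

    A nonzero exit is the condition that the server logs as a failed
    child alert, so it is reported on stderr here as well.
    """
    result = subprocess.run(argv)
    if result.returncode != 0:
        print(f"alert {argv[0]} failed, exited with {result.returncode}",
              file=sys.stderr)
        return False
    return True
```

An alert script that finishes without an explicit successful exit, or that returns the status of a failed last command, would show up here as a failure.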



Re: Monitor a wireless network

2006-03-11 Thread Jim Trocki

On Sat, 11 Mar 2006, Lee Sanders wrote:


Is that all there is to it, any hints/tips ?


very close.

the first line of output from the monitor script is the summary of all
the failures, usually a list of hosts from the hostgroup that failed,
or alternatively some text that would be suitable to be put into the
subject of an email, or something sent to an alphanumeric pager.
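
As a sketch of that output convention (Python here, though monitors can be written in anything; the function name and the result format are assumptions for illustration):

```python
def format_report(results):
    """Build monitor output from per-host results.

    results maps a host name to an error string, or to None if the host
    passed. Returns (text, exit_code): the first line of text is the
    summary (the failed hosts), the remaining lines are detail, and the
    exit code is nonzero when anything failed.
    """
    failed = sorted(host for host, err in results.items() if err is not None)
    summary = " ".join(failed)
    details = [f"{host}: {results[host]}" for host in failed]
    text = "\n".join([summary] + details)
    return text, (1 if failed else 0)
```

A monitor would print the text and exit with the returned code, so the summary line is what ends up in the mail subject or on the pager.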

have a look at the slides from the mon talk for some examples:

ftp://ftp.kernel.org/pub/software/admin/mon/mon-talk-0.4.tar.gz



Re: snmptrap2mon converter

2006-02-28 Thread Jim Trocki

On Thu, 2 Feb 2006, Eric Sorenson wrote:


I had written something like this in the past that got lost, I'm
sending out this version to the list for storage in contrib/ or,
minimally, in google's archive :-)


thanks, i added it to the contrib module in the sourceforge cvs, to be
discovered by future generations just as those prehistoric paintings were
discovered in the caves of Altamira. Five points to whoever can tell me what
album I've been listening to recently.

http://en.wikipedia.org/wiki/Altamira_%28cave%29

p.s. someone just asked me about making mon do something with snmp traps,
and i sent him this script.



Re: snmptrap2mon converter

2006-02-28 Thread Jim Trocki

On Tue, 28 Feb 2006, Eric Sorenson wrote:


Steely Dan, _The Royal Scam_ , I'm guessing.


you got it.


with the same trap2mon functionality. The routers and servers are
pointing syslog at a syslog-ng host, which lets you filter log
messages through a program in addition to writing them out.


so lars wrote that syslogd-to-mon thingy a while back, maybe some of the
code would be useful to rip off without shame^W^W^W^Wsynergistically leverage:

http://ftp.kernel.org/pub/software/admin/mon/contrib/utils/mon-syslog/



Re: upalertafter

2006-02-16 Thread Jim Trocki

On Thu, 16 Feb 2006, Philippe Ferreira wrote:



So, which one is the best choice :  mon-0.99.3 or mon-1.0.0pre5 ?

Is mon-1.0.0pre5 pretty good ?


1.0.0pre5 or 1.1.0pre2 are both better than 0.99.2.




Re: MON_STATEDIR

2006-02-10 Thread Jim Trocki

On Fri, 10 Feb 2006, Nate Reed wrote:


Just wondering how this directory (/var/lib/mon) is supposed to be used?  Is
this used by mon or intended to be used by alerts/monitors to keep state
data?


Yes, both, and that's how it is intended to be used.


This seems very useful to have.  For example, I would like to keep a counter
of the number of alarms within a configurable amount of time.  Naturally,
this would need to be kept in a file in a directory somewhere.


Great idea! :)
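
A minimal sketch of such a counter, in Python for illustration. The file format (one timestamp per line) and the function names are assumptions; the MON_STATEDIR environment variable and the /var/lib/mon fallback are taken from this thread.

```python
import os
import time


def default_state_file(name):
    """Path for a state file under MON_STATEDIR (falls back to /var/lib/mon)."""
    return os.path.join(os.environ.get("MON_STATEDIR", "/var/lib/mon"), name)


def record_alarm(state_file, window_seconds, now=None):
    """Record one alarm and return how many alarms fall inside the window.

    Timestamps older than window_seconds are dropped each time, so the
    file never grows without bound.
    """
    now = time.time() if now is None else now
    stamps = []
    if os.path.exists(state_file):
        with open(state_file) as fh:
            stamps = [float(line) for line in fh if line.strip()]
    stamps = [t for t in stamps if now - t < window_seconds]
    stamps.append(now)
    with open(state_file, "w") as fh:
        fh.writelines(f"{t}\n" for t in stamps)
    return len(stamps)
```

An alert could call record_alarm(default_state_file("alarms.count"), 600) and only page when the returned count crosses some threshold. There is no locking here, so concurrent alerts could race on the file.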



Re: Monitor works from the command-line but not from mon

2006-01-28 Thread Jim Trocki

On Fri, 27 Jan 2006, David Nolan wrote:




--On Friday, January 27, 2006 15:25:26 -0500 Kishore Jalleda 
[EMAIL PROTECTED] wrote:



I agree with David's point. I am not a Mon pro, but what I would suggest is
specifying only the alert part without the full path like this, and also
as he suggested check to see the output and the status codes when mon
starts


Oh!

I hadn't even noticed that...   that's exactly the problem.  The monitor
option should have only the name of the script in the monitor directory, not 
the full path.


Ah, I agree, that's a good catch, and not at all intuitive. I'll fix
the code so that if a monitor has a leading / (i.e. looks like a full
path) it will use it as-is instead of trying to look it up as a file
or path relative to mondir.

I guess the same should be done for alerts.
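
The lookup rule described above amounts to something like this Python sketch (the function name and the example directory are assumptions, and mon itself is Perl):

```python
import os


def resolve_script(name, script_dir):
    """Return the path to run for a monitor or alert entry.

    A name with a leading / is used as-is; anything else is looked up
    relative to the configured script directory.
    """
    if os.path.isabs(name):
        return name
    return os.path.join(script_dir, name)
```

So "fping.monitor" with a script directory of /usr/lib/mon/mon.d resolves inside that directory, while "/usr/local/bin/custom.monitor" is left alone.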



Re: mysql monitor fails: _ListTables deprecated

2006-01-28 Thread Jim Trocki

On Fri, 27 Jan 2006, Nate Reed wrote:


I installed and configured mon 0.99.2 and I noticed the following errors in
the mysql monitor:


[...]


Yet the release notes for Jun 2004 indicate this has been fixed:

-mysql.monitor - fix for deprecation of _ListTables
by Aled Treharne

What's wrong?


Get a more recent version from here:

ftp://ftp.kernel.org/pub/software/admin/mon/devel/

This explains the versioning:

http://www.kernel.org/software/mon/development.html

However one little detail it doesn't mention is how both the 1.1 and the 1.0
are actually stable at this point, both of which are available in the devel
directory.

This shouldn't remain like this. We need to spank 1.0 and 1.1 on the butt, make
an announcement, make them official or whatever, stop speaking of 1.1 as
devel but instead as stable, and make the current 1.1 the starting
point for the new devel.



Re: how does exclude_period work?

2005-10-12 Thread Jim Trocki

On Tue, 11 Oct 2005, David Nolan wrote:

The segfault bug is triggered by calling a text parsing function (from a
standard perl module, Text::ParseWords) with particularly large input.


It's a regexp in Text::ParseWords which chokes because the default
stack size on most systems isn't sufficient for it. I've seen it crash
one time, then when run after tweaking the stack size with ulimit,
it doesn't crash. Oh well. Intellectual curiosity satisfied, time to move on.


Jim was talking with me recently about actually designating something a 
stable version...  This seems like one more big reason to stop calling 0.99.2 
the stable version.  How about it Jim?  Call mon-1-1-0pre2 Mon 1.1 and cut a 
release?


Well, I like the release numbering convention that the Linux kernel uses,
where the first number to the right of the decimal point signifies a
stable release if it is an even number, or a development release if it
is an odd number.

I think we should just fork the cvs tree and call mon-1-1-0pre2 the
super fantabulous mon 1.2 (tag it as mon-1-2-0), then the head will
be 1.3, the work-in-progress, possibly unstable, possibly stable,
experimental-feature-laden code.
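
The even/odd convention is easy to check mechanically. A small Python sketch (the function name is made up for illustration):

```python
def is_stable(version):
    """True if the second version component is even (stable series).

    Works on strings such as "1.2.0" or "1.1.0pre2"; only the number
    right after the first dot is examined.
    """
    minor = int(version.split(".")[1])
    return minor % 2 == 0
```

Under this scheme 1.2.0 is a stable release, while 1.3.0 and 1.1.0pre2 belong to development series.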



Re: ListTables is deprecated

2005-08-23 Thread Jim Trocki

On Tue, 23 Aug 2005, rueh hänä wrote:


Uhm... Dont know from where it is. But i found it with google i guess. This
is because i didnt find it in the contrib dir. I even dont find the
msql-mysql.monitor in contrib, has it been removed or am i blind?
I attached mysql.monitor of my server.


that is just a really old version of msql-mysql.monitor! i don't know if you're
blind or not, but you didn't look in the obvious spot.  this script is part of
the mon distribution itself.  it's in the mon.d directory.

look at the comments at the top:

# $Id: msql-mysql.monitor,v 1.1.1.1 2004/06/09 05:18:04 trockij Exp $

# The single argument, --mode [msql|mysql] is inferred from the script name
# if it is named mysql.monitor or msql.monitor.  Thus, the following two are
# equivalent:
#
# ln msql-mysql.monitor msql.monitor
# ln msql-mysql.monitor mysql.monitor
# msql.monitor hostname
# mysql.monitor hostname


i suggest you download the latest version of mon, say, 1.0.0pre5. you can get
it from here:

ftp://ftp.kernel.org/pub/software/admin/mon/devel/

this has the fixed version of msql-mysql.monitor.


Re: Is the archieve relocated?

2005-08-03 Thread Jim Trocki

On Wed, 3 Aug 2005, Tarak Patel wrote:


Hi all,

I'm a newbie to MON. I would like to view archived messages but the URL from 
http://linux.kernel.org/mailman/listinfo/mon is dead. Has the archive been 
moved to a newer location?


hm, i'll check into the mailman archive on kernel.org. fortunately,
mail-archive.com has an archive:


http://www.mail-archive.com/mon%40linux.kernel.org/



Re: Monitor to check I/O of /dev/ttyS0

2005-07-25 Thread Jim Trocki

On Mon, 25 Jul 2005, rueh hänä wrote:


Bullshizzle

Two mails ago i asked for commands, tools, whatever, that check the io of
the serial port. If you cant help, just give no answer instead of such
verdicts. I'm still in apprenticeship and i have to learn very much. You
seem to be an experienced person. And you surely know, if there were any
similar problems as mine. Or am i the only one, who wants to monitor io on
serial ports?
If anyone knows, how to realize that with another language than bash, im
open for hints..


--- Original Message ---
From: Jim Trocki [EMAIL PROTECTED]
To: rueh hänä [EMAIL PROTECTED]
Subject: Re: Monitor to check I/O of /dev/ttyS0
Date: Mon, 25 Jul 2005 08:17:38 -0400 (EDT)

On Mon, 25 Jul 2005, rueh hänä wrote:


Thanks for the tip...
But i dont want to learn a programming language for this. The server should
be prepared for practice soon.. So i dont have time to learn a new
language..
I thought of a monitor in bash-script format.


ok, what do you want me to do, then? write it for you?




see:

Date: Sun, 24 Jul 2005 20:03:05 -0400 (EDT)
From: Jim Trocki [EMAIL PROTECTED]
To: rueh hänä [EMAIL PROTECTED]
Subject: Re: Monitor to check I/O of /dev/ttyS0

On Sun, 24 Jul 2005, rueh hänä wrote:

 But i dont know, how. Im still a beginner at the end of my
 apprenticeship..
 Are there some commands to check the IO, or some packages, that can do
 such
 things?

if you really want to get into unix programming, i would recommend the book
advanced programming in the unix environment, by w. richard stevens. the
publisher is addison wesley.

also, the o'reilly books learning perl and programming perl are
indispensable.

to which you replied:


Date: Mon, 25 Jul 2005 08:37:19 +0200 (MEST)
From: rueh hänä [EMAIL PROTECTED]
To: Jim Trocki [EMAIL PROTECTED]
Subject: Re: Monitor to check I/O of /dev/ttyS0

Thanks for the tip...
But i dont want to learn a programming language for this. The server should
be prepared for practice soon.. So i dont have time to learn a new
language..
I thought of a monitor in bash-script format.


i gave you help.  i pointed you towards resources which would teach you how to
do what you're asking to do. maybe you could learn from these resources during
your apprenticeship. or maybe you're looking to install some software that
does everything you want to do. you didn't quite say in any level of detail
what you want to do, yet you tell me that you don't have time to learn
something new. well, sometimes one needs to learn new things in order to
solve their own problems.

*shrug*

mon comes with a monitor called dialin.monitor which does i/o on the serial
port in perl with a module from CPAN called Expect. you should have a look at
that. i've never done such a thing by using only bash.


Re: Monitor to check I/O of /dev/ttyS0

2005-07-22 Thread Jim Trocki

On Fri, 22 Jul 2005, rueh hänä wrote:


Hi all

Im wondering, if there is a possibility to create a monitor, that checks the
I/O of the serial interface.


yes, there is a 100% possibility :)


RE: Mon Debugging

2005-07-15 Thread Jim Trocki

On Fri, 15 Jul 2005, Aaron Segura wrote:


It's not that I'm paying attention to it, per se - It's just that I saw
it while looking through the debug logs and thought that it may be
useful for troubleshooting purposes...apparently it's not.


sorry if it was misleading, but i had put that counter there way back when i
was debugging some of the alertafter/alertevery mechanisms, and i needed to
number the loop iterations to follow things. it wasn't intended for anything
other than that.



RE: Mon Debugging

2005-07-14 Thread Jim Trocki

On Thu, 14 Jul 2005, David Nolan wrote:

process) or hanging (due to alert issues with older mon), yes, the counter 
should go up approximately once a second.


i'm not sure why anyone is paying attention to that counter or how long it
takes to tick, since it's not used for anything other than to count how many
times the main loop has iterated.



Re: M4 Processing on config files

2005-07-13 Thread Jim Trocki

On Wed, 13 Jul 2005, Gerardo Arceri wrote:

Hi, I use m4 and mon's -M option to make it process the config files with m4.
I use it to separate the different parts of the monitoring into different
files, using include() to join them into the main config. I usually send
alerts to 4 or 5 guys and I need to repeat the alert and upalert statements
for each one on each monitored service. I'd like to use m4 to put the alert
configuration in a single file and include() it into each watch. The problem
is: does mon process just the main config file with m4 for macro expansion, or
will it also process the include()d files?


mon has no include. the include is part of m4's processing, so it should
perform macro expansion on all the files that you've included.



Re: Can inverval go inside period stanza?

2005-06-28 Thread Jim Trocki

On Tue, 28 Jun 2005 [EMAIL PROTECTED] wrote:


Can interval go inside a period stanza - I need to run tests at
different rates, depending upon time of day and day of week.  For
example:


no, it can't, but that wouldn't be a difficult feature to add.



Re: nfs monitor exisiting?

2005-06-07 Thread Jim Trocki

On Tue, 7 Jun 2005, Ed Ravin wrote:


On Tue, Jun 07, 2005 at 12:44:02PM +0200, Gilles LAMIRAL wrote:

Is there any monitor existing, where i can check if
the nfs service is correct running on my servers ??


Here is the monitor :

ls -d /mnt/dir1/* /mnt/dir2/* ...

where dir1 dir2 ... are the mounted mount points.
And this is not a joke :-)


But if the NFS server is not responding, that command will hang.  Forever.


not if you mount the nfs volumes you are monitoring with the soft
option, which will make syscalls accessing those volumes return with an
i/o error if there is a major timeout. i've done this before and it works
well.
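
A sketch of the same check as a small function, in Python for illustration (the function name is an assumption). It relies on the volumes being soft-mounted, so a dead server produces an I/O error instead of blocking; on a hard mount the listdir call would hang just like ls.

```python
import os


def failed_mounts(paths):
    """List each directory and return (path, reason) for every failure.

    On a soft-mounted NFS volume an unresponsive server eventually makes
    the underlying system call fail, which surfaces here as OSError.
    """
    failed = []
    for path in paths:
        try:
            os.listdir(path)
        except OSError as exc:
            failed.append((path, exc.strerror))
    return failed
```

A monitor built on this would print the failed paths on its first line and exit nonzero when the list is not empty.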



Re: process.monitor problems

2005-04-25 Thread Jim Trocki
On Mon, 25 Apr 2005, Patrick Marquetecken wrote:

 Hi, when i run
 ./process.monitor -c More_AcceSS_on_5353 host 10.32.3.51
 i get this error:
 Can't locate object method "new" via package "SNMP::Session" (perhaps you
 forgot to load "SNMP::Session"?) at ./process.monitor line 50.
 I use this for testing before i modify my mon.cf, how can i solve this ?

it sounds like you don't have net-snmp's SNMP perl module installed.
you can get the latest from net-snmp.org. the installation instructions
are in the tarball.

fedora core 3 ships the rpms, so if you're using linux on an rpm-based
system you may want to try to install that instead.


Re: ps.monitor

2005-04-25 Thread Jim Trocki
On Mon, 25 Apr 2005, Allan Wind wrote:

 Btw, how come the contrib directory does not include a ps monitor?

because you didn't send us one yet!

 I hacked one up in bash that I would be happy to contribute (BSD, X11 or
 similar free license), but perl would probably be better for portability
 and I think that has been done before.

ok, post it to the list when it's ready.


Re: Monitor output size limitation

2005-04-11 Thread Jim Trocki
On Mon, 11 Apr 2005, Ed Ravin wrote:

 On Fri, Apr 08, 2005 at 06:33:38PM -0700, Jim Trocki wrote:
  On Fri, 8 Apr 2005, David Nolan wrote:
   This is a known bug with some regexps in perl's Text::ParseWords that is
   tickled by large input from mon.
  well it's not really a bug, it's just that the default stack size is
  inadequate for regexps in that module. bump up the stack allocation with
  uname -s and you'll see the problem vanishes.
 I think you meant to say ulimit -s :-)

yeah, one of those u.* commands. keep trying all of them until you find
one that works :)


Re: trap received but not acted upon.

2005-04-09 Thread Jim Trocki
On Sat, 9 Apr 2005, David Nolan wrote:

 --On Friday, April 08, 2005 5:47 PM -0700 Jim Trocki [EMAIL PROTECTED]
 wrote:

  you need to use a valid period definition, i.e. something that is
  meaningful to Time::Period, such as wd {Sun-Sat}. try this:

 I don't think that's his problem.  An empty period definition is valid, it
 matches always.  Mon handles this correctly.

oh. that's busted. i never realized that was the case, nor intended it
to be so.  just reviewed the pod page for Time::Period, and i see it
does mention that a valid period string is whitespace, but it
doesn't say what it means. from testing the code it does return true
when you give it an empty period string. i'm inclined to make mon treat
the empty string as an error, since its meaning is ambiguous according
to the documentation, and on principle.

 The problem is here:
   opstatus = unknown,
 If he's using Mon 0.99.2 (which he is, the particular error message he
 reported doesn't exist in the current code), that will cause exactly this
 error.  If he's using either the latest 1.0 or 1.1 pre release

yeah, the thing to do is not use 0.99.2, but use 1.0 or 1.1 instead.
the trap handling code in the newer versions is more robust.

to reiterate and clarify, the opstatus arg in send_trap (the spc
variable in the protocol) is ignored by the mon 1.0 and 1.1 servers. in
0.99.x, opstatus sort of gets converted into a real return value via
this goofy mechanism that i don't even think i understand any more.
actually, i don't think i ever understood.

if using 1.0 or 1.1, setting retval in send_trap (sta in the protocol)
to nonzero is what the trap sender should do to indicate failure, since
that's what really gets used. retval serves the same exact purpose in a
trap as does the process exit value in a monitor, i.e. 0 for
success, nonzero for a failure. the whole trap opstatus thing should
be removed from the mon and Mon::Client code, since it's a remnant from
back when i was indecisive about how traps should work.

 Hans, I suggest you should set this to either 'ok' or 'fail', depending on
 the trap you're processing.  Or just upgrade to a newer mon and be happier.
 :)

yes, upgrade, because if he finds another quirk with 0.99.2 it won't
get fixed. most of the trap code between 0.99.2 and 1.x was rewritten,
anyway.

http://www.kernel.org/pub/software/admin/mon/devel/

mon-1.0.0pre5 or mon-1.1.0pre1, both paired with mon-client-1.0.0pre2,
or get it from cvs:

http://www.kernel.org/software/mon/development.html


Re: trap received but not acted upon.

2005-04-09 Thread Jim Trocki
On Sat, 9 Apr 2005, David Nolan wrote:

 In the documentation for Time::Period, right after it says whitespace or the
 string 'none' are legal it says:

 If the period is blank, then any time period is assumed because the

but... but... it didn't say that the last i read it, i swear!

in that case, it makes sense, though i would have never thought of
using an empty period string. that's probably why it seemed so odd.
sorry for causing the confusion.


Re: trap received but not acted upon.

2005-04-08 Thread Jim Trocki
On Fri, 8 Apr 2005, Hans Fugal wrote:

 Maybe I'm doing something wrong, because I see this in the logs but no
 alert is ever sent:

 watch default
    service default
    description Default trap service
    period
    alert mail.alert fugalh

you need to use a valid period definition, i.e. something that is
meaningful to Time::Period, such as wd {Sun-Sat}. try this:

watch default
   service default
   description Default trap service
   period wd {Sun-Sat}
   alert mail.alert fugalh


Re: Monitor output size limitation

2005-04-08 Thread Jim Trocki
On Fri, 8 Apr 2005, David Nolan wrote:

 This is a known bug with some regexps in perl's Text::ParseWords that is
 tickled by large input from mon.

well it's not really a bug, it's just that the default stack size is
inadequate for regexps in that module. bump up the stack allocation with
uname -s and you'll see the problem vanishes. but it's better to have
fixed the glitch by changing the code than to expect that people run
with a modified stack size :)
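
The shell command meant here is ulimit -s (as corrected in the follow-up message). For a launcher script, the same adjustment can be sketched with Python's resource module; the function name is an assumption, and this only illustrates the mechanism rather than recommending it over fixing the code.

```python
import resource


def raise_stack_limit(target_bytes):
    """Raise the soft stack limit to target_bytes, capped by the hard limit.

    Returns the soft limit in effect afterwards. The limit is inherited
    by child processes, so it has to be called before starting the
    program that needs the larger stack.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)
    if hard != resource.RLIM_INFINITY:
        target_bytes = min(target_bytes, hard)
    if soft != resource.RLIM_INFINITY and soft < target_bytes:
        resource.setrlimit(resource.RLIMIT_STACK, (target_bytes, hard))
    return resource.getrlimit(resource.RLIMIT_STACK)[0]
```

Note that ulimit -s works in kilobytes while getrlimit and setrlimit work in bytes.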


Re: snmp traps

2005-04-06 Thread Jim Trocki
On Wed, 6 Apr 2005, Hans Fugal wrote:

 Is there a magic wand to wave to get snmp traps?

no, that code doesn't work, since it was never completed. in fact,
any hints of it have been removed from the latest 1.0.0pre* and 1.1.*.
at this point your best bet would be to write a wrapper script for
snmptrapd which turns the snmp trap into a mon trap.


Re: unidentified output from fping

2005-04-05 Thread Jim Trocki
On Tue, 5 Apr 2005, Hans Fugal wrote:
I just discovered mon. There's a few words to describe mon. Epiphany,
nirvana, elation to name a few. So thanks a million for all the hard
work.
wow, that's over the top!
I'm getting the following alert from an fping.monitor service:
...
--
unusual errors
--
unidentified output from fping: [172.16.60.10 : duplicate for [0], 84 bytes, 326 ms]
which version of fping? i think the newer versions might have updated
their output, and fping.monitor doesn't understand the newer messages.
i'll see if i can change fping.monitor to understand the new output.


Re: Example mon.cf has misleading Time::Period specification

2005-03-31 Thread Jim Trocki
On Thu, 31 Mar 2005, Michael Vogt wrote:

In example.m4, the following is misleading because it implies the two
time periods do not overlap.
define(_OFF_HOURS_, `wd {Mon-Fri} hr {10pm-7am}, wd {Sat Sun}')dnl
define(_WORK_HOURS_,`wd {Mon-Fri} hr {7am-10pm}')dnl

The example should be changed to:
define(_OFF_HOURS_, `wd {Mon-Fri} hr {10pm-6am}, wd {Sat Sun}')dnl
define(_WORK_HOURS_,`wd {Mon-Fri} hr {7am-10pm}')dnl
yes, i see your point.
or...
define(_OFF_HOURS_, `wd {Mon-Fri} hr {10pm-7am}, wd {Sat Sun}')dnl
define(_WORK_HOURS_,`wd {Mon-Fri} hr {8am-9pm}')dnl
i'll make the correction.
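
A quick way to check two hr ranges for overlap is sketched below. It assumes, as the Time::Period documentation describes, that a range such as hr {7am-10pm} runs through the end of the 10pm hour. The helper names hours_in and overlap are made up for this sketch and are not part of mon or Time::Period; hours are given as 0-23.

```shell
#!/bin/sh
# hypothetical helpers, not part of mon or Time::Period

# print every hour from $1 through $2 inclusive, wrapping past midnight
hours_in() {
    h=$1
    while :; do
        echo "$h"
        [ "$h" -eq "$2" ] && break
        h=$(( (h + 1) % 24 ))
    done
}

# print the hours that ranges $1-$2 and $3-$4 have in common
overlap() {
    for a in $(hours_in "$1" "$2"); do
        for b in $(hours_in "$3" "$4"); do
            [ "$a" -eq "$b" ] && echo "$a"
        done
    done
    return 0
}

overlap 22 7 7 22    # 10pm-7am vs 7am-10pm: prints 22 and 7
overlap 22 7 8 21    # 10pm-7am vs 8am-9pm: prints nothing
```

Under that reading, the 10pm-7am / 8am-9pm pair is the one with no shared hours.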


Re: Bug: mon.cf keyword error in period section not detected

2005-03-25 Thread Jim Trocki
On Thu, 24 Mar 2005, Michael Vogt wrote:
Not sure if this has been reported.
It is not fixed in mon-1.0.0pre5.
ok i fixed this and the previous problem you reported in the 1.0.0 branch.
thanks.


Re: Mon and non-failure logging

2005-03-25 Thread Jim Trocki
On Fri, 25 Mar 2005, David Nolan wrote:
Or you could use 'redistribute foo.alert', available in mon-1.1.0pre1 which 
causes the configured alert script to get called for every status update.
i've been working out a feature which does this but per-period rather than
per-service. this method seems just a bit more flexible.
in the longer term, the logging issue should be solved differently. i've been
pondering a way to connect global/service/period events to modular loggers,
which would optionally be persistent. i.e. when mon starts up, it will fire up
the various loggers if need be, they will hang around and wait for log messages
to come from the mon server via a tcp socket, a named pipe, whatever. this
would allow you to have a logger that, once started, connects to some
database such as postgres or mysql, stays connected, and uses that as the
back-end for logging.


Re: Upgrading to 1.1.0.

2005-03-10 Thread Jim Trocki
On Thu, 10 Mar 2005, David Nolan wrote:
$from isn't a simple string, you need to do something like:
yes, so i plucked code from mon to make this separate test program
which receives traps and decodes what recv returns, and it does
just as i expect:
#!/usr/bin/perl
use Socket;
$bindaddr = INADDR_ANY;
$udpproto = getprotobyname ('udp');
socket (TRAPSERVER, PF_INET, SOCK_DGRAM, $udpproto) or die "socket: $!";
bind (TRAPSERVER, sockaddr_in (2583, $bindaddr)) or die "bind: $!";
# recv returns the packed address of the sender
$a = recv (TRAPSERVER, $buf, 65536, 0);
($port, $addr) = sockaddr_in ($a);
$addr = inet_ntoa ($addr);
print "port=$port\n";
print "addr=[$addr]\n";
send a trap to that, and you'll see it report something like
this:
$ ./tst
port=33002
addr=[127.0.0.1]
dunno, i'll have to look at it harder.


Re: Passing variable from foo.monitor to foo.alert

2005-01-18 Thread Jim Trocki
On Tue, 18 Jan 2005, Alexandre Pashai wrote:
hello!
I need a result found in foo.monitor to be displayed in foo.alarm...
How can i pass custom variables from monitor to alarm context ??
all of the data sent to stdout by a monitor is sent to the alert's
stdin, so if you're writing a custom monitor and alert, you can decide
however you'd like to encode those variables and values. e.g., make the
monitor output lines which look like VnameA=valueA, and use a regexp
in the alert to look for /^V(nameA)\s*=\s*(.*)$/ on stdin to identify
the variables.
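
A minimal sketch of the alert side of that idea, written for /bin/sh. The parse_vars helper and the sample variable names are invented for illustration; the only thing taken from the description above is the V-prefixed name=value line format.

```shell
#!/bin/sh
# hypothetical helper for an alert script: pull Vname=value lines out of
# the monitor output arriving on stdin and print them as name=value
parse_vars() {
    sed -n 's/^V\([A-Za-z0-9_]*\)[[:space:]]*=[[:space:]]*\(.*\)$/\1=\2/p'
}

# simulated monitor output; in a real alert this comes from mon on stdin
printf 'host1\nVload=3.2\nsome detail text\nVtemp = 41\n' | parse_vars
```

Lines that do not match the pattern, such as the summary line, are simply skipped.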


Re: Passing variable from foo.monitor to foo.alert

2005-01-18 Thread Jim Trocki
On Tue, 18 Jan 2005, Alexandre Pashai wrote:
It seems not to be correct:
in monitor, if do echo doggy bag, i get the following in alarm stdin ($*)
$* = -s X -g  -h 148.25.32.65 -t 110002124515 148.25.32.65
the second 148.25.32.65 is from the mon.cf (alarm line arg)
So no trace of my custom string doggy bag from stdin (alarm)
trust me, this is how it works. i suspect you have a problem with your
script. $* is not stdin, it's the positional parameters passed on the
commandline. if you're writing this in /bin/sh then the alert would
read stdin like this:
while read a
do
echo "a=$a"
done


Re: Help with mon.cf switches

2005-01-05 Thread Jim Trocki
On Wed, 5 Jan 2005, Craig Reeson wrote:
Guys,
I'm new to Mon and have taken over a non working install of Mon which I
desperately need to get working...
Anyway, what does the -P option mean/do?
Ie. monitor process.monitor -P augw -C /etc/mon/process.monitor.conf
for process.monitor, -P does nothing. i'm not sure what you're trying to
do or from where you got that example, but if you elaborate then maybe i
can help out.


Re: fping.monitor improvement

2004-12-13 Thread Jim Trocki
On Mon, 29 Nov 2004, Ed Ravin wrote:
I needed fping to use a larger packet size in order to monitor when a
tunnel loses the ability to pass full-sized packets.
simple enough, thanks. i committed that change to the mon-1-0-0pre1 branch.


Re: alerts functionality

2004-11-23 Thread Jim Trocki
On Fri, 19 Nov 2004, Joubin Moshrefzadeh wrote:

 host1 goes down - 1 alert sent
 then host2 goes down - 2 alerts sent
 then host3 goes down - 3 alerts sent
 etc...
 
 so total alerts sent is 1+2+3...+10?
 
 is the latter correct? I've only tested it up to two hosts going down 
 consecutively :)

it's correct depending on how you configure mon. this is the default
behavior, but you can change it.

i noticed the man page needed some updating, so i did so and checked in the
changes to the cvs tree on the mon-1-0-0pre1 branch. the part which affects
this behavior is the alertevery parameter.  here's a summary:


ALERT DECISION LOGIC
   Upon a non-zero or zero exit status, the associated  alert  or  upalert
   program (respectively) is started, pending the following conditions: If
   an alert for a specific service is disabled, do not send an alert.   If
   dep_behavior  is  set  to 'a', and a parent dependency is failing, then
   suppress the alert.  If the alert has previously been acknowledged,  do
   not send the alert, unless it is an upalert.  If an alert is not within
   the specified period, record the failure via syslog(3) and do not  send
   an alert.  If the failure does not fall within a defined period, do not
   send an alert.  No upalerts are sent without corresponding down alerts,
   unless no_comp_alerts is defined in the period section. An upalert will
   only be sent if the previous state is  a  failure.   If  an  alert  was
   already  sent  within  the last alertevery interval and the monitor has
   continued to report a nonzero exit status for a time period  less  than
   that  interval,  do  not  send another alert, unless the summary output
   from the most recent monitor process differs from the previous.  Other-
   wise,  send  an  alert using each alert program listed for that period.
   The observe_detail argument to  alertevery  affects  this  behavior  by
   observing  the  changes in the detail part of the output in addition to
   the summary line.  If a monitor has successive failures and the summary
   output  changes  in each of them, alertevery will not suppress multiple
   consecutive alerts.  The  reasoning  is  that  if  the  summary  output
   changes,  then  a  significant  event  occurred  and the user should be
   alerted.  The ignore_summary  option  will  suppress  all  successive
   alerts  while the service continues to fail, even if the summary output
   changes.  If the strict alertevery option is used,  then  behave  the
   same  as  if  ignore_summary was set, but do not reset the alertevery
   timer when  the  monitor  exits  with  a  zero  status.   For  example,
   alertevery  24h  strict  will  only  send  out an alert once every 24
   hours, regardless of whether the monitor output changes, or if the ser-
   vice stops and then starts failing.

...

   alertevery timeval [observe_detail | ignore_summary | strict ]
  The alertevery keyword (within a period  definition)  takes  the
  same  type  of argument as the interval variable, and limits the
  number of times an alert is sent when the service  continues  to
  fail.   For example, if the interval is 1h, then the alerts in
  the period section will only be triggered once every hour as the
  service  continues  to fail.  The alertevery interval timer will
  be reset if the monitor stops exiting with a nonzero exit status
  (i.e. it reports a success).  If the alertevery keyword is omit-
  ted in a period entry, an alert will be sent out  every  time  a
  failure  is  detected.  By default, if the summary output of two
  successive failures changes, then  the  alertevery  interval  is
  overridden,  and  an  alert  will be sent.  The ignore_summary
  argument  suppresses  this  behavior.   If   the   string
  observe_detail is the last argument, then both the summary and
  detail output lines will be considered when comparing the output
  of  successive  failures.   If  the  string strict is the last
  argument, then the output of the monitor or the state change  of
  the  service  will  have no effect on when alerts are sent. That
  is, alertevery 24h strict will send only one  alert  every  24
  hours, no matter what.  Please refer to the ALERT DECISION LOGIC
  section for a detailed explanation of how alerts are suppressed.
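
The suppression rules in the excerpt above can be condensed into a rough sketch. This is a simplification and not mon's actual code: should_alert and its arguments are invented for illustration, and it leaves out acknowledgements, dependencies, periods, and the timer reset on success that distinguishes strict from ignore_summary.

```shell
#!/bin/sh
# should_alert MODE ELAPSED INTERVAL CHANGED   (hypothetical helper)
#   MODE      default, observe_detail, ignore_summary or strict
#   ELAPSED   seconds since the last alert was sent
#   INTERVAL  the alertevery interval in seconds
#   CHANGED   1 if the compared output differs from the previous failure
#             (summary only by default, summary plus detail for observe_detail)
# returns 0 to send the alert, 1 to suppress it
should_alert() {
    mode=$1 elapsed=$2 interval=$3 changed=$4
    # outside the interval an alert always goes out
    [ "$elapsed" -ge "$interval" ] && return 0
    case $mode in
        ignore_summary|strict) return 1 ;;
        *) [ "$changed" -eq 1 ] ;;
    esac
}

should_alert default 600 3600 0 || echo "suppressed: same output within 1h"
should_alert default 600 3600 1 && echo "sent: summary changed"
should_alert strict 600 3600 1 || echo "suppressed: strict ignores output"
```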




Re: mon logging setup problem

2004-11-15 Thread Jim Trocki
On Mon, 15 Nov 2004, Shea Frederick wrote:

 Fixed that, but still not creating a log file.
 
 logdir = /var/log/mon
 dtlogfile = dtlog

do you see messages from mon in your /var/log/messages file, or wherever you've
told syslog to send stuff from facility daemon priority
err/info/notice? that's where failure events are logged.

dtlog is only appended to when a service comes back up after being down.
otherwise you don't know how long it was down for, so you can't write the log
saying how long it was down for.



Re: hpov.alert example

2004-08-18 Thread Jim Trocki
On Wed, 18 Aug 2004, Byron Emerson [mon] wrote:

 Hello,
 
 Does anyone have a working example of an hpov.alert


you mean this one doesn't work, or do you not know about it?

http://ftp.kernel.org/pub/software/admin/mon/contrib/alerts/hpov/



mon-1.0.0pre4 available

2004-08-03 Thread Jim Trocki
the latest pre-1.0.0 release is available from:

ftp://ftp.kernel.org/pub/software/admin/mon/devel/

1.0.0 is getting close now. in fact, we've never been closer.  out of
all the pre-1.0.0 releases, this has been the closest. the closeness of
this release to 1.0.0 surpasses all previous releases.

i welcome all feedback. in fact, you're welcome to join the development
mailing list if you'd like to participate in heated discussions about
bugs, features, and misfeatures. info about the list is found here:

http://lists.sourceforge.net/mailman/listinfo/mon-devel


Changes between mon-1.0.0pre3 and mon-1.0.0pre4
Tue Aug  3 08:02:35 EDT 2004
---

-when allow_empty_group is not set and no host arguments
 to pass to a monitor, the interval wasn't being reset so
 it would spam the syslog with lots of no host arguments
 messages. this is fixed.

-in reset_timer, there was a chance that _timer could get
 set to a negative value, which is not right. fixed it.

-fixed the bug where lots of mon processes could accumulate if the
 exec of an alert failed. also fixed error handling of failed
 alerts.

-added show failures only button to mon.cgi to speed it up.
 by Ed Ravin [EMAIL PROTECTED]

-small permissions fix to rpm spec file

-added MON_CFBASEDIR variable to monitor and alert
 environment, which is set to the value of cfbasedir in the
 config file.

-removed unfinished snmp trap handling stuff. it doesn't work at all,
 and it's misleading to people even though the man page says it doesn't
 work.

-added monitor_duration and monitor_running output to opstatus detail
 in monshow




Re: Question ...

2004-07-14 Thread Jim Trocki
On Mon, 12 Jul 2004, Scott A. Davis wrote:

 Maybe I have overlooked something, but is there a way to completely 
 flush all outage history logs, etc... so I can start fresh?  I have been 
 doing some testing, and there are a lot of 'planned' outages showing up.  
 Now I am ready to start fresh, and would like to clear the log history.

stop the mon server, remove the dtlogfile, then restart the mon server.



mon-1.0.0pre3 available

2004-07-12 Thread Jim Trocki
for those who would like to try the latest pre-release of mon 1.0.0
(tagged mon-1-0-0pre3 in CVS), you may obtain it here:

ftp://ftp.kernel.org/pub/software/admin/mon/devel/

the most important update is the consolidation of the monitor exit
processing and the trap/traptimeout handling code which has resulted
in more sensible code and a bunch of bug fixes with respect to trap and
traptimeout handling.

this should be paired with the mon-client-1.0.0pre2 perl module (there
have been no updates to that since 1.0.0pre2).

Changes between mon-1.0.0pre1 and mon-1.0.0pre3
Mon Jul 12 09:12:29 EDT 2004
---

-changed README to refer to the new, more sensible name for the perl module
 client, which is mon-client

-applied eric's updates to INSTALL and added a mention of monshow and mon.cgi as
 the web interfaces

-added eric's rpm spec file (i removed the patches because they are no longer
 needed)

-added lmb's syslog.monitor (a nifty hack)

-added 'alertevery strict' code and docs, updated the README and INSTALL to
 mention CVS, updated CREDITS

-incorporated mon.cgi 1.52

-minor addition to alert behavior explanation in mon.8

-in dialin.monitor.wrap.c, return the exit status of execv (if it fails, that is)

-fixed path to perl in file_change.monitor and smtp3.monitor

-added some rcs tags to identify the file versions

-handle_trap_timeout now calls process_event, and it works fine with
 alert/upalert/alertevery/etc. as shown by my testing

-received traps now reset the trap timeout counter, and fixed some
 other stuff wrt trap timeouts

-added sub process_event and made proc_cleanup and handle_trap use it
 so that the alert mgmt code is shared rather than in two places. i tested
 as much of it as i could and all seems to work well now, especially
 upalert, alertafter, alertevery with traps.

-added per-service _monitor_duration variable which records how many
 seconds the previous monitor took to execute. this is available via
 list opstatus. if no monitor has executed yet then the value is -1.

-added per-service _monitor_running variable whose value is 0 or 1
 depending on whether the monitor is currently running for that service.

-removed gunk from handle_trap regarding the various TRAP_COLDSTART, etc.
 processing, since most of it was a bad idea anyway, or at least as far as
 i could tell. traps and their exit values are now processed exactly as
 monitors are, which simplifies things greatly and adds to more intuitive
 functionality. this means the spc value in a trap is now ignored.

-fixed some args processing in call_alert

-fixed a bug which would prevent alerts or upalerts
 from being sent when call alerts is passed the output
 argument whose value is undef

-remove usage of parse_line in trap processing
 (backported from mon 1.1 code)

-make esc_str escape spaces in order to be compatible with monperl-1-0-0pre1

-added list of all possible client commands to moncmd

-added --community to set the snmp community in reboot.monitor

-patch to traceroute.monitor from meekj
 added StateDir, TracerouteOptions, StopAt config options
 some bugfixes to config file parsing
 reap children to avoid defunct processes
 added timeout alarm

-up_rtt.monitor
 added -r to log individual rtts, better error reporting for tcp and udp check




Re: Best way to contribute

2004-06-29 Thread Jim Trocki
On Tue, 29 Jun 2004, Peter Wirdemo (MO/EMW) wrote:

 Hello all!
 
 What is the best way to contribute, now when mon is moving over to sourceforge.
 My interest in contributing is mostly in monitors, not so much in mon itself.

i've made the [EMAIL PROTECTED] mailing list for this purpose:

http://lists.sourceforge.net/mailman/listinfo/mon-devel

(it'll take a little bit for it to show up)



two new mailing lists, mon-devel and mon-commit

2004-06-29 Thread Jim Trocki

fyi, i've created two new mailing lists managed by lists.sourceforge.net:


mon-devel:  discussions regarding mon development and innards,
patch submissions, etc.

https://lists.sourceforge.net/mailman/listinfo/mon-devel

mon-commit: cvs commit logs

https://lists.sourceforge.net/mailman/listinfo/mon-commit




Re: Why doesn't _trap_timer get reset?

2004-06-28 Thread Jim Trocki
On Mon, 28 Jun 2004, Jim Trocki wrote:

 #
 # a trap received resets the trap timeout timer
 #
 if (exists $sref->{traptimeout)


oops, forgot to close that {, should be:

if (exists $sref->{traptimeout})




Re: Why doesn't _trap_timer get reset?

2004-06-28 Thread Jim Trocki
On Mon, 28 Jun 2004, David Nolan wrote:

 While it doesn't add any bugs, I don't believe it fixes any either. 

it does indeed fix the bug where a received trap would not reset the
_trap_timer, preventing traptimeout from working at all. i've tested it and
it works properly now.

 Careful reading of the code makes it clear that _trap_timer is only ever 
 relevant after a timeout has already occurred.
 It prevents a timeout alert from happening on every pass through the code.

that is not true. _trap_timer is what counts down the timeout in the first
place. it is what gauges whether or not a timeout has occurred. once a
timeout happens, as indicated when _trap_timer drops to zero or below,
do_alert is called and _trap_timer is then reset to the value of
traptimeout, and it starts counting down again.

what's supposed to prevent _trap_timer from hitting 0 in the first place is
the reception of a trap, and that is what was broken, and the patch i posted
fixes that.

 Therefore the way its implemented is confusing.

ok, i think you may be thinking of some other functionality that i don't know
about.

 If we're going to fix this, I'd do one of two things:
 
 Option 1: make _trap_timer entirely responsible for all trap timeouts by 
 resetting it when a trap is received OR when a timeout happens, and only 
 testing _trap_timer when determining whether or not a timeout has occurred.

well, that's what it does now with this little patch to handle_trap.
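
The countdown described in this message can be modeled in a few lines of shell. This is only a toy model under the assumptions stated in the thread (the timer counts down, an alert fires at zero or below, and a received trap resets it); the variable and function names mirror the discussion but none of this is code from mon.

```shell
#!/bin/sh
# hypothetical model of the countdown, not code from mon
traptimeout=30
trap_timer=$traptimeout
alerts=0

# called as time passes: $1 is the number of seconds elapsed
tick() {
    trap_timer=$((trap_timer - $1))
    if [ "$trap_timer" -le 0 ]; then
        alerts=$((alerts + 1))      # stands in for do_alert
        trap_timer=$traptimeout     # start counting down again
    fi
}

# a received trap resets the countdown
receive_trap() {
    trap_timer=$traptimeout
}

tick 20; receive_trap; tick 20      # trap arrived in time, no alert
tick 10                             # 30s of silence since the trap
echo "alerts=$alerts"
```

Without the reset in receive_trap, the second tick 20 would already push the timer below zero, which is the behavior the patch was meant to fix.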




RE: Mon Question ...

2004-06-27 Thread Jim Trocki
On Sat, 26 Jun 2004, Scott A. Davis wrote:

 Hmmm... Ok.  What am I doing wrong?  My head hurts.
 
 I have the files (at least for testing purposes) 
 
 /cgi-bin/monshow.cgi  -rwxr-xr-x
 /cgi-bin/monshowrc    -rwxr-xr-x

monshowrc shouldn't be executable, and it's a bad idea to put
files which are not meant to be executed in your cgi-bin path.

 /cgi-bin/mon-lib.pl   -rwxr-xr-x

i've never heard of this.

 In monshow.cgi, I have the line:
 
   my $VIEWPATH = /var/www/html/cgi-bin

 mon-lib.pl is also in the same dir, as required by monshow.cgi line 26:
 require './mon-lib.pl';

i don't know what monshow you're using, but there has never been such a
thing as mon-lib.pl, and monshow has never required it. it does, however,
require you to install the Mon::Client perl module, which you can get
here:

ftp://ftp.kernel.org/pub/software/admin/mon/Mon-0.11.tar.gz

the monshow you should be using comes from the clients directory in
the main mon tarball.

   does not seem to work.

what does apache's error_log say? usually that is in /var/log/httpd.




Re: Mon Question ...

2004-06-26 Thread Jim Trocki
On Sat, 26 Jun 2004, Scott A. Davis wrote:

 If I pull up the standard mon.cgi interface, all devices are listed (as they
 should be).  My question is: Is there a way to parse out the devices on a
 department-by-department basis so that whenever the Payroll department goes
 to mon.cgi, they see ONLY their devices?  

i don't think mon.cgi can do this, but monshow can via the views mechanism.
you'd put the view in the directory /etc/mon/monshow (say you named the file
test) and then query the url like: http://monhost/monshow.cgi/test

the details are in the man page. there is also a sample monshowrc in the
etc/ directory.




Re: Required Temperature Sensor(s) for Mon?

2004-06-25 Thread Jim Trocki
On Fri, 25 Jun 2004, Peter Curran wrote:

 Hello All:
 
 I'm a Newbie; thanks for the forum. I'm using compaq ML370 G3 (Debian3.0r2) and 
 would like to use mon.
  
 Could somebody please suggest to me which Temperature Sensor(s) one could best use 
 with mon.
 Are there Network Cards(etc.) that could carry such Temperature Sensors

all the dallas semiconductor 1-wire (ibutton) stuff works fine with unix.
the rs232 interface i've used is the DS9097U-009, and i've used both a
thermochron ibutton with it and


http://www.ibutton.com/products/readers.html

the thermochron starter kit is good and it includes all the stuff you need:

http://www.ibutton.com/ibuttons/1921Kit.html

if you want to rig up remote thermistors you should look into the
DS1920/DS1820/DS18S20 devices.

on the ibutton web page there is a link somewhere to a bunch of code to
manipulate these devices. there is a temp.c program which reads the
temp of a DS1920, so you could wrap that up in a mon monitor script to
do the temp monitoring you need.
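
A wrapper along those lines might look roughly like the sketch below. It relies on the usual monitor convention of exiting non-zero on failure and printing a summary line first; check_temp, the sensor name and the threshold are made up, and how the temp program is actually invoked is left open because its arguments are not given here.

```shell
#!/bin/sh
# sketch of a temperature monitor; check_temp is a made-up helper
# check_temp NAME TEMP MAX: fail if TEMP (degrees, may be fractional) > MAX
check_temp() {
    name=$1 temp=$2 max=$3
    if awk -v t="$temp" -v m="$max" 'BEGIN { exit !(t > m) }'; then
        echo "$name"                                    # summary line
        echo "$name: temperature $temp is above $max"   # detail
        return 1
    fi
    return 0
}

# in a real monitor the reading would come from the sensor program, e.g.
#   reading=$(./temp ...)    # hypothetical invocation, arguments unknown
reading=23.5
check_temp serverroom "$reading" 30
status=$?
echo "exit status would be $status"
```

A real monitor would end with exit $status so that mon sees the failure.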

i used to use a little gadget from spiderplant.com for this purpose,
but it broke and they stopped selling them, anyway. the DS9097U-009 is
less expensive and serves the same purpose.




Re: broken no_comp_alerts

2004-06-24 Thread Jim Trocki
On Wed, 23 Jun 2004, Daniel Fenert wrote:

 Recently I was flooded by upalerts, and found the cause, I think that
 no_comp_alerts was made up in mind, but wasn't finished in the code :)
 
 Here's the patch (applies cleanly on -47, I haven't checked other releases)

thanks. i've merged them into the mon-1-0-0pre1 cvs branch, and with
the testing i've done it seems to work better.



