Andrew,

Yes, I agree on "the PITA stuff is building a good reporting interface, good presentation, and meaningful graphs", and of course dtquery also runs on my monitoring site. We also run Bugzilla for mon alerts; very nice. But if alerts recover automatically (a lot will do... lucky us... > 90%), there is no need to focus on that problem anymore; only when the same problem keeps returning do we use Bugzilla for normal bug handling. One PITA is telling Bugzilla what to do with an UP alert, which is not a bug but a bug fix (it should give the initial ALERT (= bug) the fixed/closed status).
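For what it's worth, the way I imagine wiring that up: mon passes a -u flag to the alert script when it is an upalert, so one wrapper can either file a ticket or close the one it opened earlier. A minimal sketch in Python, with the actual Bugzilla calls stubbed out as an in-memory dict (the helper names open_ticket/close_ticket are my own invention, not a real Bugzilla API):

```python
# In-memory stand-in for the Bugzilla database, keyed by (hostgroup, service).
# In real life open_ticket()/close_ticket() would talk to Bugzilla instead.
tickets = {}
next_id = 1

def open_ticket(group, service, summary):
    """File a new 'bug' for a down alert (stub)."""
    global next_id
    bug_id = next_id
    next_id += 1
    tickets[(group, service)] = {"id": bug_id, "status": "NEW", "summary": summary}
    return bug_id

def close_ticket(group, service):
    """An up alert is not a bug but a bug fix: mark the initial alert FIXED."""
    bug = tickets.get((group, service))
    if bug and bug["status"] != "FIXED":
        bug["status"] = "FIXED"
        return bug["id"]
    return None  # no open bug to close

def handle_alert(group, service, summary, is_upalert):
    """Entry point: mon passes -u for upalerts, so route on that flag."""
    if is_upalert:
        return close_ticket(group, service)
    return open_ticket(group, service, summary)
```

So the down alert files the bug and the matching up alert gives that same bug the fixed status, which is exactly the manual step we want to get rid of.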
I'm trying to get not only up and down reports, but also a few other performance figures like CPU/DISK/NET/PROCESS load (mostly from SNMP agents, I think). As a first try, an SNMP monitor script could dump the data from each tested host into a DB as well, right after the normal mon testing part. With this, I can also check and report on a server's resources after it has been operational for some time. If some resource runs out of limits within a certain time, that signals the need for, say, a second web server. We use a 70% policy here: whenever a server is consistently above 70% resource use, without any reason/errors, we normally add extra servers or extra resources (like RAM or DISK).

Behind all this, of course I could just set up RRDTOOL / MRTG or other tools to graph such data, BUT: with a monitoring site of 200+ servers and at least 10 performance counters each (cpu/disk/net/etc.), out of the box you get 200 * 10 = 2000 web pages to check, AND * 4 (day, week, month, year) = 8000 graphs... Also, the same data would be queried twice (by mon and by an extra collector daemon), and ad-hoc manual/user queries are not possible.

The last part is to create a meaningful report with up/down time, resource use and performance together, bundled/compacted into 1 or 2 end-user (paper) pages. (OK, something everybody wants... but after 6 pilots of commercial products we think this is, as of today, not available; again we returned to open source and a 'do it yourself' policy.) If all the needed data is located in a DB, then it's possible to create this (with an MS Access (ODBC) frontend, a web frontend, or an automated process which e-mails the results, for example).

Well, anybody another point of view?? This is the goal I want to reach somehow, with or without extra daemons beside mon. It could also be called 'many to one' data processing??
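To make the "last step in each monitoring script" idea concrete: every monitor appends its sample to one table, and reporting becomes plain SQL instead of 8000 pre-rendered graphs. A rough sketch in Python, with SQLite standing in for the real PostgreSQL server (the table and column names are my own invention, not anything mon defines):

```python
import sqlite3

# One wide-open samples table; every monitor script appends here as its
# final step, after the normal pass/fail test.
conn = sqlite3.connect(":memory:")  # stand-in for the real PostgreSQL server
conn.execute("""
    CREATE TABLE samples (
        ts      INTEGER,   -- unix timestamp of the check
        host    TEXT,      -- which of the 200+ servers
        counter TEXT,      -- 'cpu', 'disk', 'net', 'process', ...
        value   REAL       -- whatever the monitor measured (e.g. % used)
    )
""")

def store_sample(ts, host, counter, value):
    """The proposed last step of every monitor script: dump, don't graph."""
    conn.execute("INSERT INTO samples VALUES (?, ?, ?, ?)",
                 (ts, host, counter, value))

# A few fake CPU readings for two hosts
for ts, host, value in [(1, "web1", 80), (2, "web1", 75), (3, "web1", 85),
                        (1, "web2", 30), (2, "web2", 35), (3, "web2", 40)]:
    store_sample(ts, host, "cpu", value)

def hosts_over_limit(counter, limit=70.0):
    """The 70% policy as an ad-hoc query: which hosts average above the limit?"""
    rows = conn.execute(
        "SELECT host, AVG(value) FROM samples WHERE counter = ? "
        "GROUP BY host HAVING AVG(value) > ?", (counter, limit))
    return dict(rows.fetchall())

print(hosts_over_limit("cpu"))  # web1 averages 80 -> candidate for extra resources
```

The same table would also feed the 1-2 page end-user report (uptime, resource use, trends) from any ODBC/web frontend; in the real setup the INSERT would just go through Perl DBI to PostgreSQL instead of sqlite3.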
greetzz
dick
<>

----- Original Message -----
From: "Andrew Ryan" <[EMAIL PROTECTED]>
To: "Dick de Waal" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Saturday, September 29, 2001 00:11
Subject: Re: New idea / database question....

> So, first off, let me point you to dtquery, in case you're not familiar with
> it. It's a tool I wrote that does a lot of what you're asking for, except the
> relational DB part (which really is not very hard; the PITA stuff is building
> a good reporting interface, good presentation, and meaningful graphs).
> http://www.nam-shub.com/files/dtquery/
>
> There are a couple of fundamental problems here in creating such a system
> with today's mon:
>
> 1) The information which mon's monitors give you varies widely in format,
> both in the summary line and especially in the detail area. Effectively this
> makes it impossible to get accurate information on a per-host basis across a
> wide variety of alerts. This is planned to change in post-1.0 versions of mon
> (1.1/1.2?) to allow for better reporting, especially on a per-host basis. I
> guess this isn't a problem if you have a 1:1 hostgroup/host mapping, but that
> is not common for larger installations.
>
> 2) If you use mon's dtlog to get your downtime information, you miss certain
> things (detail output from alerts, unchecked services, and services which go
> down but don't come up before mon is reset or shut down).
>
> This also affects any attempt to build a meaningful database of downtime
> based on alerts. I considered this but didn't have the time and energy to
> work on something which, at best, would still be a hack.
>
> You can use something like the bugzilla_alert script to log alerts into a
> bugzilla database, which will give you some reporting/querying capabilities.
> That's probably the cheapest, easiest way to get mon info into a relational
> DB.
>
> rrdmon probably comes closest to getting the most accurate data, because it
> queries a running mon server for status every 30-60 sec, but does so with a
> very high performance penalty (it consumes a lot of CPU) and doesn't have any
> advanced querying features. It also doesn't let you get any more info out of
> the alert and monitor scripts other than "hostgroup up" or "hostgroup down".
>
> I've heard of people that leverage their mon-generated rtt logs into a DB of
> some sort and do stuff with them, but none of that has been released to the
> public.
>
> Other than that, my plan is basically to wait until the next version of mon
> allows us to do the job better from the ground up. But if anyone is going to
> work on something in the meantime, feel free to use me as a resource.
>
> andrew
>
> On Friday 28 September 2001 07:26 am, Dick de Waal wrote:
> > Thanks for the replies in the first place!
> > Let me explain again:
> >
> > Of course I'm using MRTG / RRDMON / RRDTOOL / and other good graphers!
> > They can give a perfect graphical overview, but there is MORE
> > informational (performance) data used within mon which is only used for
> > the monitoring part, and after that (good or false) it's thrown away.
> >
> > Scott puts it right (I agree also):
> > "Mon is an excellent monitoring tool, as it was designed to be; that
> > doesn't necessarily make it an excellent tool for measuring performance,
> > however. I'd prefer that core development of mon continue in the direction
> > of monitoring state and sending notifications, and leave the task of
> > graphing, reports, etc. to other tools."
> >
> > If we leave mon to monitoring, states and alerts, the need for a global
> > DBMS data store is there! Once the data is in a DBMS, separate tools can
> > handle it without disturbing mon.
> >
> > Which data am I talking about? Not the complete SNMP MIB tree :-)), for
> > example, but just the data which mon is already collecting for correct
> > operation! For example: fping.monitor results; snmpvar.monitor results
> > (cpu/process/disk/etc); http_xxx.monitor results; etc. Just that data
> > needs to be analysed in real time for correct operation (= mon), BUT it
> > is also important for reports (trends / pro-active / average service
> > level reached / etc) after running for x time.
> >
> > This prevents the data being retrieved twice, and why not? It could
> > (/should) be the last step in each monitoring script to store the
> > retrieved monitoring data in a database and nothing more!
>
> > This idea is just to prevent mon being used as a performance
> > grapher/reporter and other _hacks_; let different programs do the job!
> >
> > Anyone some perl --> postgresql commands/script lines for me?
> >
> > greettz
> > dick
> > <>
> >
> > ----- Original Message -----
> > From: "Scott Prater" <[EMAIL PROTECTED]>
> > To: "Dick de Waal" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
> > Sent: Friday, September 28, 2001 10:36
> > Subject: RE: New idea / database question....
> >
> > > Right now, we're using a combination of mon with HP-OV to cover
> > > monitoring. I'm looking into RRDTool for reporting.
> > >
> > > After spending a couple of years studying the topic, I've finally found
> > > it useful to separate (mentally) the task of monitoring state from the
> > > task of measuring performance. Basically, what you're asking are two
> > > different questions:
> > >
> > > * Is my system OK right now? (fundamentally a yes/no question, although
> > > there are many degrees of "yes, but...") This is monitoring of state.
> > > Tools such as Tivoli, HP-OV, mon, Big Brother, etc. focus on answering
> > > this question.
> > >
> > > * How is my system performing overall? (well, poorly, sometimes better
> > > than other times, etc.) This is monitoring of performance over time,
> > > usually shown in graphs. Tools such as MRTG and RRDTool focus on
> > > answering this question.
> > >
> > > Of course, the lines blur, especially when you talk of what different
> > > products provide -- the answer to the second question can determine the
> > > answer to the first. Tools such as MRTG provide limited threshold
> > > checking and notification, but they are no substitute for a
> > > full-featured monitoring system, such as mon. On the other hand, tools
> > > such as mon can be adapted to save state information for reporting
> > > purposes (with modules such as rrdmon), but they're no substitute for
> > > reporting tools such as RRDTool.
> > >
> > > So far, I haven't found a reasonably-priced (or freeware) package that
> > > does it all to my taste. Tivoli and HP-OV come close, but they are
> > > still focused more on the first question than on the second. There's
> > > the OpenNMS project (http://www.opennms.org/), an open source freeware
> > > alternative to tools such as HP-OV, but as far as I can tell, it's not
> > > really ready for primetime yet.
> > >
> > > So, like most, I use a combination of tools to give me the big picture.
> > > As another person pointed out, it's a pain in the neck to have to
> > > configure several pieces of software to send multiple queries just to
> > > get data on one element -- you usually end up cobbling together a
> > > series of management scripts to tie it all together. But as the two
> > > tasks (checking state and measuring performance) are fundamentally two
> > > different tasks, albeit very closely related, I prefer to work with
> > > tools optimized to perform either one or the other.
> > >
> > > Mon is an excellent monitoring tool, as it was designed to be; that
> > > doesn't necessarily make it an excellent tool for measuring
> > > performance, however. I'd prefer that core development of mon continue
> > > in the direction of monitoring state and sending notifications, and
> > > leave the task of graphing, reports, etc. to other tools.
> > >
> > > my two cents...
> > >
> > > Scott Prater
> > > Dpto. Sistemas
> > > [EMAIL PROTECTED]
> > >
> > > SERVICOM 2000
> > > Av. Primado Reig, 189 entlo.
> > > 46020 Valencia - Spain
> > > Tel. (+34) 96 332 12 00
> > > Fax. (+34) 96 332 12 01
> > > www.servicom2000.com
> > >
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On behalf of
> > > Dick de Waal
> > > Sent: Thursday, 27 September 2001 22:59
> > > To: [EMAIL PROTECTED]
> > > Subject: New idea / database question....
> > >
> > > Hello All!
> > > Did anybody put the monitoring (performance) data from diverse
> > > monitoring scripts into a database (like postgresql) for further
> > > analysing/reporting? Up- and downtimes are not enough for me; I also
> > > want to store the SNMP (cpu, disk, process, etc), HTTP response, etc.
> > > data in this database to create service level reports... or even,
> > > because the data is then available in real time, do some pro-active
> > > SLA monitoring!!!
> > > (Retrieving is somehow already done for the monitoring part... and
> > > after comparing in the monitoring scripts, this data is thrown away...
> > > but it is just useful!!)
> > >
> > > Has anybody some ideas/scripts, or wanna be a beta/alpha/stable
> > > tester????
> > > I'm now using the latest version and _of course_ it's running well
> > > again!! Even on my test setup on a Sony Vaio laptop....
> > >
> > > greeetzz
> > > dick
> > > <>
> > > (not really a perl programmer..........)
