We also have a few Solaris machines around. We've purchased an SNMP agent
from Empire Technology (www.empiretech.com) which can report various
performance-related system parameters, like swap usage, system load, CPU
utilization, number of open file descriptors, number of processes, etc.
The bad news is that their product doesn't support FreeBSD, although it does
support Linux. So we cannot use this tool to monitor system performance on
FreeBSD. Instead, we need something else which can do roughly the same
thing.
Of all these parameters, our immediate interests are the following:
* CPU utilization, % used in kernel space vs % used in user space
* RAM utilization
* Swap utilization
* Network bandwidth usage
* Number of file descriptors used
As usual, any hints/comments are more than welcome. Please do mail a copy
of your response to me directly. Thanks!
I've been writing a program to monitor various values over SNMP --
it's not finished, but it works. Basically, you tell it what to watch,
and if the values go outside defined thresholds, or certain conditions are
or are not met, it triggers an "alert" -- mail, paging (both TAP and SNPP),
etc. Right now, it's running under Linux with ucd-snmp, but porting it over
to FreeBSD should be simple -- the errors I'm getting are dumb ones that are
easily fixed. If anybody's interested, let me know -- it's not available
to the general public (I'm sorta embarrassed by the code), but the geeks of
the world can get their hands on what I have so far by asking.
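
The watch-and-alert idea described above can be sketched roughly as follows.
This is only an illustration in Python, not the actual program: the Watch
class, the out_of_range and check functions, and the send callback are all
made-up names, and the real tool may decide things quite differently.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Watch:
    """One watched value with an allowed range (all names are illustrative)."""
    name: str
    low: Optional[float]   # None means no lower bound
    high: Optional[float]  # None means no upper bound
    alerts: List[str]      # names of alert targets, e.g. "mike_pager"


def out_of_range(watch: Watch, value: float) -> bool:
    """Return True when value falls outside the watch's thresholds."""
    if watch.low is not None and value < watch.low:
        return True
    if watch.high is not None and value > watch.high:
        return True
    return False


def check(watch: Watch, value: float, send: Callable[[str, str], None]) -> int:
    """Fire every alert attached to the watch if the value is out of range.

    Returns the number of alerts sent.
    """
    if not out_of_range(watch, value):
        return 0
    message = f"{watch.name}: value {value} outside [{watch.low}, {watch.high}]"
    for target in watch.alerts:
        send(target, message)
    return len(watch.alerts)


if __name__ == "__main__":
    # Hypothetical example: a load-average watch with two alert targets.
    sent = []
    load = Watch("owen_processload", low=None, high=4.0,
                 alerts=["mike_pager", "mike_mail"])
    check(load, 6.5, lambda target, msg: sent.append((target, msg)))
    for target, msg in sent:
        print(target, "<-", msg)
```

Passing the delivery function in as a callback keeps the threshold logic
separate from the transport, so mail, TAP and SNPP could each be plugged in
without touching the check itself.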
mike
(I'll include one of the config files for your browsing and commentary.)
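
For anyone who wants to experiment with the format, here is a small Python
sketch that reads blocks of the shape used in the config file. The grammar is
inferred from this one sample only (a block kind, an opening brace, one
key/value pair per line, a closing brace, and # for comments); the function
name parse_config is made up, and the real program's parser may well accept
more than this.

```python
import re
from typing import Dict, List, Tuple

# A block is its kind (e.g. "doublewalk") plus its fields. Values are kept
# in lists because a key such as "alert" may appear more than once.
Block = Tuple[str, Dict[str, List[str]]]

_OPEN = re.compile(r"^(\w+)\s*\{$")
_ENTRY = re.compile(r'^(\w+)\s+(?:"([^"]*)"|(\S+))$')


def parse_config(text: str) -> List[Block]:
    """Parse 'kind { key value ... }' blocks, skipping blank and # lines."""
    blocks: List[Block] = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if current is None:
            m = _OPEN.match(line)
            if not m:
                raise ValueError(f"expected 'kind {{', got: {line!r}")
            current = (m.group(1), {})
        elif line == "}":
            blocks.append(current)
            current = None
        else:
            m = _ENTRY.match(line)
            if not m:
                raise ValueError(f"bad entry: {line!r}")
            key = m.group(1)
            # Quoted values land in group 2, bare values (numbers) in group 3.
            value = m.group(2) if m.group(2) is not None else m.group(3)
            current[1].setdefault(key, []).append(value)
    if current is not None:
        raise ValueError(f"unterminated block: {current[0]}")
    return blocks
```

Values are left as strings here; converting "frequency" or "mode" to integers
would be a natural next step once their meaning is confirmed.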
# Machines at SML
#doublewalk {
#name "r2d2_processlist"
#host "r2d2.smlab.com"
#community "Mlx-20L"
#fromoid ".1.3.6.1.4.enterprises.ucdavis.proctable.prentry.prerrorflag"
#tooid ".1.3.6.1.4.enterprises.ucdavis.proctable.prentry.prerrmessage"
#frequency 41
#mode 0
#alert "mike_pager"
#}
#doublewalk {
# name "r2d2_df"
# host "r2d2.smlab.com"
# community "Mlx-20L"
# fromoid ".1.3.6.1.4.enterprises.ucdavis.disktable.dskentry.dskerrorflag"
# tooid ".1.3.6.1.4.enterprises.ucdavis.disktable.dskentry.dskerrormsg"
# frequency 42
# mode 0
# alert "mike_pager"
#}
doublewalk {
name "palpatine_pslist"
host "palpatine.smlab.com"
community "Mlx-20L"
fromoid ".1.3.6.1.4.enterprises.ucdavis.proctable.prentry.prerrorflag"
tooid ".1.3.6.1.4.enterprises.ucdavis.proctable.prentry.prerrmessage"
frequency 41
mode 0
alert "mike_pager"
}
doublewalk {
name "palpatine_df"
host "palpatine.smlab.com"
community "Mlx-20L"
fromoid ".1.3.6.1.4.enterprises.ucdavis.disktable.dskentry.dskerrorflag"
tooid ".1.3.6.1.4.enterprises.ucdavis.disktable.dskentry.dskerrormsg"
frequency 42
mode 0
alert "mike_pager"
}
doublewalk {
name "watto_pslist"
host "watto.smlab.com"
community "Mlx-20L"
fromoid ".1.3.6.1.4.enterprises.ucdavis.proctable.prentry.prerrorflag"
tooid ".1.3.6.1.4.enterprises.ucdavis.proctable.prentry.prerrmessage"
frequency 41
mode 0
alert "mike_pager"
}
doublewalk {
name "watto_df"
host "watto.smlab.com"
community "Mlx-20L"
fromoid ".1.3.6.1.4.enterprises.ucdavis.disktable.dskentry.dskerrorflag"
tooid ".1.3.6.1.4.enterprises.ucdavis.disktable.dskentry.dskerrormsg"
frequency 42
mode 0
alert "mike_pager"
}
# check owen's transmit.LOCK lockfile for the reporting/paging system to make
# sure it's not too long...
reportchain {
name "owen_txlockfile"
host "owenpub.smlab.com"
community "Mlx-20L"
oidroot ".1.3.6.1.4.enterprises.ucdavis.50.101"
frequency 60
alert "mike_pager"
}
doublewalk {
name "owen_processload"
host "owenpub.smlab.com"
community "Mlx-20L"
fromoid ".1.3.6.1.4.enterprises.ucdavis.loadtable.laentry.laerrorflag"
tooid ".1.3.6.1.4.enterprises.ucdavis.loadtable.laentry.laerrmessage"
frequency 51
mode 0
alert "mike_pager"
}
doublewalk {
name "owen_df"
host "owenpub.smlab.com"
community "Mlx-20L"
fromoid ".1.3.6.1.4.enterprises.ucdavis.disktable.dskentry.dskerrorflag"
tooid ".1.3.6.1.4.enterprises.ucdavis.disktable.dskentry.dskerrormsg"
frequency 52
mode 0
alert "mike_pager"
alert "mike_mail"
}
doublewalk {
name "owen_processlist"
host "owenpub.smlab.com"
community "Mlx-20L"
fromoid ".1.3.6.1.4.enterprises.ucdavis.proctable.prentry.prerrorflag"
tooid ".1.3.6.1.4.enterprises.ucdavis.proctable.prentry.prerrmessage"
frequency 53
mode 0
alert "mike_pager"
}
doublewalk {
name "tarkin_processload"
host "www.smlab.com"
community "Mlx-20L"
fromoid ".1.3.6.1.4.enterprises.ucdavis.loadtable.laentry.laerrorflag"
tooid