Hi Chris, Daniel,

I write about a number of the configuration decisions we made to achieve our current level of performance on my blog:
http://www.semintelligent.com/blog/?q=Nagios

Please note that a number of the configuration steps we have taken go against what the Nagios documentation recommends. If you want to do anything similar, make sure you understand the Nagios documentation and the risks of violating its recommendations.

We have done a lot of custom development to make implementing SNMP-based checks across a large number of hosts easier for us:

1) We develop agent-specific checks in Perl that run cleanly under ePN (we currently use Net-SNMP and SysEdge, and are starting to do Cisco monitoring). These groups of checks are associated with host groups specific to each agent type (e.g. net-snmp-host).

2) We create a custom base template for each agent type. The template has custom attributes that associate the SNMP version, community string, etc. with the host template. We also use custom attributes in each agent-specific check (e.g. CPU), so that all thresholds are defined at the host level and we can provide default thresholds. For example:

    define host {
        name                        net_snmp_host
        hostgroups                  +net_snmp_hosts
        __snmp_version              2c
        __snmp_community            myreadonlycommunity
        __snmp_port                 161
        __snmp_storage_partitions   all
        __snmp_storage_warn         90
        __snmp_storage_crit         95
        __snmp_la_warn              15:10:5
        __snmp_la_crit              30:20:10
        __snmp_mem_warn             free,lt,8
        __snmp_mem_crit             free,lt,5
        __snmp_swap_warn            50
        __snmp_swap_crit            65
        __snmp_cpu_warn             wait,gt,20
        __snmp_cpu_crit             wait,gt,30
        ...
        register                    0
    }

For custom communities we create separate templates, e.g.:

    define host {
        name                southwest-region-host
        hostgroups          +southwest-hosts
        __snmp_community    southWestRegionCommunity
        register            0
    }

So now our end users can easily tell Nagios to poll their hosts with SNMP, and they can override our thresholds at the host level without having to know a thing about programming:

    define host {
        use                 generic-host, net_snmp_host, southwest-region-host
        # Override CPU default thresholds
        __snmp_cpu_warn     wait,gt,40
        ...
    }

3) We have developed, and hope to release sometime this year, a Perl-based, ePN-friendly SNMP check script that handles counters and gauges well and lets you check multiple SNMP OIDs at once. This has been extremely useful for custom SNMP application agents. A service definition ends up looking like this:

    define service {
        use                     check_snmp_oids-base
        service_description     Custom App - 5 minute SNMP checks
        __snmp_oids_spec        -O 'TimeMin:g:1.3.6.1.4.1.1900.5.5.2.2.1.0' \
                                -O 'labelFor1stOid:g:1.3.6.1.4.1.9999.1.3.0' \
                                -O 'labelFor2ndOid:g:1.3.6.1.4.1.9999.1.4.0' \
                                -O 'labelFor3rdOid:g:1.3.6.1.4.1.9999.1.5.0'
        __snmp_oids_crit_spec   labelFor1stOid,lt,0
        hostgroup_name          custom-agent-group
        servicegroups           custom-service-group
    }

In some cases we check 15-20 OIDs at once this way. Our script uses memcached to cache counter data so that deltas are computed properly, and we have code that adjusts the data for over-samples, under-samples, and large deltas.

Many of our checks are based on code I wrote that can be downloaded here, though we have significantly enhanced things since:

http://www.nagios3book.com/nagios-3-enm/checks/

So, a lot of development time up front, but the end result is that we get terrific performance and a lot of flexibility. We are using Nagios to replace $$$ COTS products, so our company is happy to have us spend time doing custom development. I realize many of you do not have that luxury, so I understand that this won't be ideal for many of you. Sorry.

Development time with two people to get to where we are now: about 3-4 months. We have permission to release a lot of the code we have written; we just need time to package it properly for a public release, so hopefully we can share some of our tools and help others do something similar without the 3-4 months of development time :p

Hope this helps more than it confuses.
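
For anyone trying to picture the threshold specs and the counter handling described above, here is a rough sketch. It is in Python rather than the Perl our checks actually use, and it is not our released code. It assumes that a spec such as wait,gt,20 means "label, comparison, limit", that counters are 32-bit and wrap at most once between polls, and it uses a plain dict where the real script uses memcached. All function names are made up for illustration, and the over-sample, under-sample and large-delta adjustments are not reproduced.

```python
import time

COUNTER32_MAX = 2 ** 32  # assumption: Counter32 values wrap to 0 after 2**32 - 1

# Stand-in for memcached: cache key -> (timestamp, raw counter value).
_cache = {}

# Only the two comparisons that appear in the example templates.
OPS = {"lt": lambda a, b: a < b, "gt": lambda a, b: a > b}


def parse_threshold(spec):
    """Split a 'label,op,value' spec such as 'wait,gt,20' into its parts."""
    label, op, value = spec.split(",")
    if op not in OPS:
        raise ValueError("unsupported operator: " + op)
    return label, op, float(value)


def breaches(spec, readings):
    """Return True if the reading named in the spec violates the threshold."""
    label, op, limit = parse_threshold(spec)
    return OPS[op](readings[label], limit)


def counter_rate(key, value, now=None, modulus=COUNTER32_MAX):
    """Per-second rate since the previous sample, or None if there is none."""
    now = time.time() if now is None else now
    previous = _cache.get(key)
    _cache[key] = (now, value)  # always remember the newest sample
    if previous is None:
        return None  # first poll: nothing to diff against yet
    prev_time, prev_value = previous
    elapsed = now - prev_time
    if elapsed <= 0:
        return None  # duplicate or out-of-order sample
    delta = value - prev_value
    if delta < 0:
        delta += modulus  # assume exactly one wrap between the two polls
    return delta / elapsed
```

With this sketch, breaches("wait,gt,20", {"wait": 25.0}) is true, so a host using the default __snmp_cpu_warn would go to warning, while the same reading against the overridden wait,gt,40 would not. The first call to counter_rate for a key returns None, because a delta needs a cached earlier sample.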
- Max
