[Nagios-users] plugin for monitoring nagios, remembering to turn notifications back on
I could not think of a way to search for this kind of plugin in the usual places, so I ended up writing one myself.

We had turned off notifications for a service a while ago and forgot to turn them back on once the problem was fixed. So now I have a service which checks the services in the status.dat file and goes red if a service has had notifications turned off for longer than a day or two.

How does one search the exchanges for a plugin which checks nagios itself? Doing a search for "nagios" obviously does not work. It is sort of like looking for R in Google: it turns out that there are a lot of pages that use that letter, and not all of them are about the statistics tool.

Anyway. Right now, the service is written and does not come up, probably because something is missing. After all, what host does one attach it to? It is about the nagios service itself, so no host would really be correct. Or I am missing something else. I am sure I can get it to work. I will be finishing this and will put it up in the exchange, but if anyone has suggestions, I would be interested in hearing them.

One bother is that if a service has its notifications turned off, none of the time markers in the service say when that happened. If a service went down and up 7 days ago and the notifications were turned off 5 days ago, all the times point to the event 7 days ago. I do not see one which references the change 5 days ago. There seems to be no variable in the changed service that stores the time a human changed something about it.

So, I have to store the time I first saw a service with its notifications turned off. I was storing this in a file, but I am going to use the output line from this service. Then the status.dat entry for this service will tell me about the notification setting time for the other services. Seems a bother, but there it is.

Is there a name for this sort of nagios-monitoring, reflexive plugin?
Does anyone know a key word to search for to find others like this?

thanx - ray

___
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
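The check described in this message can be sketched roughly as follows. This is only an illustration, not the plugin the poster wrote. It assumes status.dat uses "servicestatus { key=value ... }" blocks with host_name, service_description and notifications_enabled fields, as Nagios 3 appears to write them. The function names, the two-day threshold and the in-memory first_seen dictionary are all hypothetical; persisting first_seen between runs (in a file, or in the plugin output as the poster plans) is left out.

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2

# Matches a block opener such as "servicestatus {".
BLOCK_START = re.compile(r"^(\w+)\s*\{$")


def parse_status(text):
    """Yield (block_type, fields) for each 'name { key=value ... }' block."""
    block_type, fields = None, {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        match = BLOCK_START.match(line)
        if match:
            block_type, fields = match.group(1), {}
        elif line == "}" and block_type is not None:
            yield block_type, fields
            block_type = None
        elif block_type is not None and "=" in line:
            key, _, value = line.partition("=")
            fields[key] = value


def muted_services(text):
    """Return 'host;service' names whose notifications are disabled."""
    return [
        f"{fields.get('host_name')};{fields.get('service_description')}"
        for block_type, fields in parse_status(text)
        if block_type == "servicestatus"
        and fields.get("notifications_enabled") == "0"
    ]


def check(text, first_seen, now, max_age=2 * 86400):
    """Update first_seen in place and return (exit_code, message).

    first_seen maps 'host;service' to the time the service was first
    observed with notifications off (hypothetical bookkeeping).
    """
    muted = muted_services(text)
    # Forget services whose notifications have been turned back on.
    for name in list(first_seen):
        if name not in muted:
            del first_seen[name]
    stale = []
    for name in muted:
        seen = first_seen.setdefault(name, now)
        if now - seen > max_age:
            stale.append(name)
    if stale:
        return CRITICAL, "CRITICAL - notifications off too long: " + ", ".join(stale)
    return OK, f"OK - {len(muted)} service(s) with notifications off"
```

A real plugin would read the status file, call check() with time.time(), print the message and exit with the returned code.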
Re: [Nagios-users] check_http with auth issue
On Jan 20, 2011, at 1:35 PM, stan wrote:

> Hi,
> We're having problems with Nagios3 (3.2.0) reporting 'HTTP WARNING HTTP/1.1
> 401 Unauthorized' for check_http queries to several switched CDUs of the
> same type. All these CDUs have HTTP Server = enabled, Authentication= Basic.
>
> Running check_http from the command line returns an OK status:
> /usr/lib/nagios/plugins/check_http -I ppcr23ps1 --authorization=user:passwd
> HTTP OK HTTP/1.1 200 OK - 763 bytes in 0.267 seconds
>

We had a similar problem. We have several CNAME entries for different names for some of our hosts. We were using a host definition that included the IP address. Then we switched to a host definition that was using the DNS name only.

Also, be careful of certain characters in your password digest. If an ampersand appears in the password digest (and they seem to pop up about 1/3 of the time), you have to protect it with a backslash. When we ran the check_http command by hand, we were of course putting single quotes around the password digest and, gee, there was no problem.

These may not be your issue, but they may point you to something else. Good luck.

cheers - ray

> check_http is reporting OK status for all the other devices we're checking
> from this Nagios3 server.
>
> This Nagios3 server is an internal server in a distributed nagios
> configuration, and we have 1 identical CDU configured on the main nagios
> server outside the firewall (the external nagios is 3.1.0). The check_http on
> the Nagios 3.1.0 server (external) does not give the WARNING 401 Unauthorized
> message. It displays HTTP OK HTTP 1.1 200 OK.
>
> To verify that the CDU config was not the problem or the firmware version
> wasn't the problem on the 2 new CDUs being configd for the internal server,
> we added the external CDU
> (already reporting OK on nagios 3.1.0) to the Nagios3 configuration & did a
> check_http. Like the other 2 units, the command line reports HTTP OK but
> nagios displays the 401 warning.
>
> We've run from the command line check_http -v and it's identical on both
> servers (everything is OK).
> However, the header on the devices that do not get the 401 warning start
> with a Server: descrip and Connection: close before the Content-Type
>
> CDU:
>
> STATUS: HTTP/1.1 200 OK
> HEADER
> Content-Type: text/html
> (the rest of the header info follows)
>
> All other devices:
>
> STATUS: HTTP/1.1 200 OK
> HEADER
> Server: Embedded Web Server(or whatever server description per device)
> Connection: close
> Content-Type: text/html
> (the rest of the header info follows)
>
> Any ideas why it works on 3.1.0 and not 3.2.0?
> Thanks
>
> --
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
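The ampersand remark in the reply above comes down to shell quoting: a bare "&" is a shell operator, so a password containing one has to be quoted or escaped wherever a shell will parse the command line. The sketch below illustrates the idea in Python and is not part of the thread. The credentials are made up, and the assumption that Nagios hands check commands to a shell is mine, based on the behaviour described in the reply.

```python
import shlex
import subprocess
import sys


def shell_safe(value):
    """Quote a value so a POSIX shell passes it through as one literal word."""
    return shlex.quote(value)


def echo_argument(value):
    """Run a child process from an argument list (no shell involved) and
    return the argument exactly as the child received it."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


credentials = "user:pa&ss"  # hypothetical credentials containing an ampersand
print(shell_safe(credentials))     # the form to use where a shell parses the line
print(echo_argument(credentials))  # with an argument list, no quoting is needed
```

Single quotes, as used on the command line in the reply, have the same effect as the backslash: the shell never sees the "&" as an operator.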
[Nagios-users] tracking AWStats systems?
I know that there are a huge number of web sites that nagios could track in different ways. But it seems to me that AWStats is commonly used and of interest to the people who also use nagios. So, does anyone have a suggestion for this?

We have several AWStats systems and we want to check things like: is it still logging, is it seeing only 0 hits for some reason, is traffic dropping below or rising above a certain level, and are certain types of hits appearing?

I suppose a nagios plugin can check the apache logs directly. But we appreciate using AWStats and it would be nice to enable nagios to check if the AWStats systems we have are working.

Any thoughts or suggestions?

thanx - ray
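Of the checks listed above, "is it still logging" is the easiest to approximate without knowing anything about AWStats internals: look at how recently a file that AWStats is expected to update was modified. The sketch below does only that. The path is supplied by the caller, the thresholds are arbitrary, and nothing here parses AWStats data, so hit counts and traffic levels are out of scope.

```python
import os
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_age(age_seconds, warn_after=2 * 3600, crit_after=6 * 3600):
    """Map the age of the file to a Nagios state (thresholds are examples)."""
    if age_seconds >= crit_after:
        return CRITICAL
    if age_seconds >= warn_after:
        return WARNING
    return OK


def check_data_file(path, now=None):
    """Return (exit_code, message) based on the file's modification time."""
    now = time.time() if now is None else now
    try:
        modified = os.path.getmtime(path)
    except OSError as error:
        return UNKNOWN, f"UNKNOWN - cannot read {path}: {error}"
    age = max(0, now - modified)
    state = classify_age(age)
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[state]
    return state, f"{label} - {path} last updated {int(age // 60)} minutes ago"
```

A wrapper script would call check_data_file() with the path of the data file on that system, print the message and exit with the returned code.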
Re: [Nagios-users] tracing nagios actions
On Dec 20, 2010, at 10:48 AM, mark bradley wrote:

> There's always strace(1) if you want to dive into the details ...
>
> Mark

Ah. Unfortunately, I am on Mac OS X and not Linux.

On Dec 20, 2010, at 10:28 AM, Mike Chesnut wrote:

>> Of course, one cannot tell what command is _actually_ being executed or
>> which command _was_ actually executed. I pointed this out in a previous
>> post (below). Apparently there are no workarounds for this.
>
> If I understand what you're asking about, I've used this to achieve it
> in the past:
>
> http://www.waggy.at/nagios/capture_plugin.htm

This is interesting. It is unfortunate that one would have to put the plugin onto every command, but it is definitely a possibility.

Of course, nagios could itself have this kind of functionality. But that would be up to the implementors. Thanks for the replies, though.

cheers - ray

>
> On Mon, Dec 20, 2010 at 1:18 PM, Ray Kiddy wrote:
>
> On Dec 20, 2010, at 9:36 AM, Polifemo, Salvatore wrote:
>
>> Yes, run the actual command from the command line as Steve demonstrated.
>>
>> Make sure which command is being used, and if you run the command with no
>> parameters it will display the correct usage.
>>
>> Salvatore Polifemo
>> Sr. Systems Security Specialist
>> ConEdison Solutions
>> 100 Summit Lake Drive
>> Valhalla, NY 10595
>>
>
> Of course, one cannot tell what command is _actually_ being executed or which
> command _was_ actually executed. I pointed this out in a previous post
> (below). Apparently there are no workarounds for this.
>
> - ray
>
> Begin forwarded message:
>
>> From: Ray Kiddy
>> Date: November 17, 2010 9:42:59 AM PST
>> To: Nagios Users List
>> Subject: [Nagios-users] can log show actual command executed?
>> Reply-To: Nagios Users List
>>
>>
>> I am having a problem figuring out what is actually being executed from
>> a service. Is there a way to get the nagios log to contain the actual
>> command being executed?
>>
>> This is what I am seeing in the Nagios.log file:
>>
>> [1290013792] SERVICE ALERT: myhost.com;Special App;CRITICAL;SOFT;1;(Service Check Timed Out)
>>
>> This is what I see in the nagios.dat file:
>>
>> check_command=check_http!/myURL!alive
>>
>> So, this shows me what the command string is in the service.cfg. I cannot
>> see, though, what the actual command line is at this moment in time. It
>> turns out that this check_command corresponds (I think) to:
>>
>> check_http -u /myURL -s alive
>>
>> How would I know this, though, if the command definition had been changed or
>> if it is using, because of a mis-spelling, a command I do not think it is
>> using? If I go into the command.cfg and switch the order of parameters, for
>> example, I see nothing in these logs that tells me what is doing what.
>>
>> I know the simplest answer is "You should not do that." But my point is that
>> the log file does not have enough information to tell me what happened at a
>> past moment of time. I would need the log information _and_ the state of the
>> command definitions at that time. If a log does not show you what happened
>> in the past, what is its purpose?
>>
>> I am having a problem with a particular web application. For some reason I
>> put in the check and it fails. I execute the check_http that I _think_ this
>> service is doing, and it gives me an OK. I ended up creating a custom
>> executable that calls curl and fetches against the same URL and this now
>> works fine. Kind of lame, though. I use check_http in about 100 other
>> services. So, why is this one single service not working? An obvious answer
>> is that I am not calling the command in the way I think I am. But if I look
>> in the log to see what the service did, I can see what I _think_ it did
>> based on what I can see in what I _think_ is the correct command definition.
>> But I really do not know. I do not see a line like "check_http -u /myURL -s
>> alive" in the log, so, I cannot see if I am mis-reading things.
>>
>> Any suggestions?
>>
>> - ray
>>
>
>
>
>> From: steve f [mailto:a31mod...@hotmail.com]
>> Sent: Monday, December 20, 2010 12:14 PM
>> To: nagios-users@lists.sourceforge.net
>> Subject: Re: [Nagios-users] tracing nagios actions
>>
>> Mark,
>>
>> I think Salvatore means run the check manually from the command line , make
>> sure you run it as the nagios user and try setting
Re: [Nagios-users] tracing nagios actions
On Dec 20, 2010, at 9:36 AM, Polifemo, Salvatore wrote:

> Yes, run the actual command from the command line as Steve demonstrated.
>
> Make sure which command is being used, and if you run the command with no
> parameters it will display the correct usage.
>
> Salvatore Polifemo
> Sr. Systems Security Specialist
> ConEdison Solutions
> 100 Summit Lake Drive
> Valhalla, NY 10595
>

Of course, one cannot tell what command is _actually_ being executed or which command _was_ actually executed. I pointed this out in a previous post (below). Apparently there are no workarounds for this.

- ray

Begin forwarded message:

> From: Ray Kiddy
> Date: November 17, 2010 9:42:59 AM PST
> To: Nagios Users List
> Subject: [Nagios-users] can log show actual command executed?
> Reply-To: Nagios Users List
>
>
> I am having a problem figuring out what is actually being executed from a
> service. Is there a way to get the nagios log to contain the actual command
> being executed?
>
> This is what I am seeing in the Nagios.log file:
>
> [1290013792] SERVICE ALERT: myhost.com;Special App;CRITICAL;SOFT;1;(Service Check Timed Out)
>
> This is what I see in the nagios.dat file:
>
> check_command=check_http!/myURL!alive
>
> So, this shows me what the command string is in the service.cfg. I cannot
> see, though, what the actual command line is at this moment in time. It turns
> out that this check_command corresponds (I think) to:
>
> check_http -u /myURL -s alive
>
> How would I know this, though, if the command definition had been changed or
> if it is using, because of a mis-spelling, a command I do not think it is
> using? If I go into the command.cfg and switch the order of parameters, for
> example, I see nothing in these logs that tells me what is doing what.
>
> I know the simplest answer is "You should not do that." But my point is that
> the log file does not have enough information to tell me what happened at a
> past moment of time. I would need the log information _and_ the state of the
> command definitions at that time. If a log does not show you what happened in
> the past, what is its purpose?
>
> I am having a problem with a particular web application. For some reason I
> put in the check and it fails. I execute the check_http that I _think_ this
> service is doing, and it gives me an OK. I ended up creating a custom
> executable that calls curl and fetches against the same URL and this now
> works fine. Kind of lame, though. I use check_http in about 100 other
> services. So, why is this one single service not working? An obvious answer
> is that I am not calling the command in the way I think I am. But if I look
> in the log to see what the service did, I can see what I _think_ it did based
> on what I can see in what I _think_ is the correct command definition. But I
> really do not know. I do not see a line like "check_http -u /myURL -s alive"
> in the log, so, I cannot see if I am mis-reading things.
>
> Any suggestions?
>
> - ray
>

> From: steve f [mailto:a31mod...@hotmail.com]
> Sent: Monday, December 20, 2010 12:14 PM
> To: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] tracing nagios actions
>
> Mark,
>
> I think Salvatore means run the check manually from the command line, make
> sure you run it as the nagios user and try setting the warning & critical
> values to something that will make it fail also:
>
> /usr/local/nagios/libexec
> ./check_disk -w 50 -c 70 -p /home
> DISK OK - free space: /home 440 MB (95% inode=99%);| /home=20MB;436;416;0;486
>
> The -p just checks a specific path. ( FYI )
>
> Steve
>
>
> Date: Mon, 20 Dec 2010 11:50:14 -0500
> From: gopearl...@gmail.com
> To: nagios-users@lists.sourceforge.net
> Subject: Re: [Nagios-users] tracing nagios actions
>
> Hi Salvatore,
>
> They're all Unix (Redhat) servers. By check command do you mean nagios -v?
> I've done that and I do not get any errors.
>
> Thanks,
> Mark
>
> On Mon, Dec 20, 2010 at 11:35 AM, Polifemo, Salvatore
> wrote:
> Are these Windows or *nix servers?
>
> Either way run the check command manually from a console and see what the
> results are.
>
>
> Salvatore Polifemo
> Sr. Systems Security Specialist
> ConEdison Solutions
> 100 Summit Lake Drive
> Valhalla, NY 10595
>
> From: mark bradley [mailto:gopearl...@gmail.com]
> Sent: Monday, December 20, 2010 11:14 AM
> To: nagios-users@lists.sourceforge.net
> Subject: [Nagios-users] tracing nagios actions
>
> Hi,
>
> I have a small-ish number of servers
Re: [Nagios-users] combining more than one status.dat file...
On Nov 19, 2010, at 10:08 PM, Rutger Blom wrote:

> Hi,
>
> I would really suggest you have a look at check_mk and multisite. This
> Nagios addon supports having multiple Nagios servers in one management
> interface.
> http://mathias-kettner.de/checkmk_multisite.html
>
> /Rutger
>

This seems to be a larger solution than I am seeking. I am still thinking a ten-line script that grabs some status.dat files, matches on a regex to rename services and combines the files will work. Any reason why not?

- ray

> On Saturday, November 20, 2010, Ray Kiddy wrote:
>>
>> I have an idea on how to get a sort of distributed nagios to work. It
>> actually seems simple. Maybe I am not seeing something. Maybe someone here
>> knows a reason this will not work.
>>
>> I have a nagios server here. It is watching some things in about a dozen
>> data centers around the world. There is a nagios running in China and
>> another one running in Ankara, as well.
>>
>> I want to see, in one interface, how all three of these nagios servers see
>> things. I think I should be able to take the status.dat here, take the
>> status.dat from Ankara (with a "_tr" added to the service names), take the
>> status.dat from China (with a "_cn" added to the names of the services), and
>> put these all three together into the one status.dat file here. I figure I
>> have to do it in such a way that I do not get into a race condition with the
>> nagios server itself, but other than that, what is the problem with this?
>>
>> I had asked for this kind of thing before. With our nagios server, we can
>> see if the cn servers are up, but we care a lot more about whether the cn
>> servers are up for the cn customers. It does not matter how the CN-US link
>> is behaving. So, if the cn servers look to be down, but Ankara can see them,
>> we know there is no problem.
>>
>> This just seems to be a really simple way to get this. But, given the
>> complexities of some of discussions about this stuff, I am dubious.
>> If it was this simple, it would be documented, and even talked about, no? I am
>> sure I am not the only one who wants this. Any thoughts?
>>
>> thanx - ray
>>
>
> --
> Rutger Blom
> Luzernvägen 14
> 227 38 LUND
> Sweden
> Tel. +46 763 46 99 44
> www.rutgerblom.com
> about.me/rutgerblom
[Nagios-users] combining more than one status.dat file...
I have an idea on how to get a sort of distributed nagios to work. It actually seems simple. Maybe I am not seeing something. Maybe someone here knows a reason this will not work.

I have a nagios server here. It is watching some things in about a dozen data centers around the world. There is a nagios running in China and another one running in Ankara, as well.

I want to see, in one interface, how all three of these nagios servers see things. I think I should be able to take the status.dat here, take the status.dat from Ankara (with a "_tr" added to the service names), take the status.dat from China (with a "_cn" added to the names of the services), and put all three together into the one status.dat file here. I figure I have to do it in such a way that I do not get into a race condition with the nagios server itself, but other than that, what is the problem with this?

I had asked for this kind of thing before. With our nagios server, we can see if the cn servers are up, but we care a lot more about whether the cn servers are up for the cn customers. It does not matter how the CN-US link is behaving. So, if the cn servers look to be down, but Ankara can see them, we know there is no problem.

This just seems to be a really simple way to get this. But, given the complexities of some of the discussions about this stuff, I am dubious. If it was this simple, it would be documented, and even talked about, no? I am sure I am not the only one who wants this. Any thoughts?

thanx - ray
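The merge proposed in this message might look roughly like the sketch below. It is an illustration under assumptions, not a tested recipe: it assumes status.dat consists of "name { key=value ... }" blocks and that services live in "servicestatus" blocks with a service_description field. Only the service blocks are taken from the remote files, so the local header blocks are not duplicated. The function names are hypothetical.

```python
import re

# One whole block: "name {" on its own line up to the closing "}" line.
BLOCK = re.compile(
    r"^[ \t]*(\w+)[ \t]*\{[ \t]*\n(.*?)^[ \t]*\}[ \t]*$",
    re.MULTILINE | re.DOTALL,
)
SERVICE_LINE = re.compile(r"^([ \t]*service_description=)(.*)$", re.MULTILINE)


def service_blocks(status_text):
    """Return the text of every servicestatus block in a status.dat text."""
    return [
        match.group(0)
        for match in BLOCK.finditer(status_text)
        if match.group(1) == "servicestatus"
    ]


def tag_services(status_text, suffix):
    """Append suffix (for example '_cn') to every service_description value."""
    return SERVICE_LINE.sub(
        lambda match: match.group(1) + match.group(2) + suffix, status_text
    )


def merge(local_text, remotes):
    """Combine the local status text with renamed remote service blocks.

    remotes maps a suffix such as '_tr' to that site's status.dat text.
    """
    parts = [local_text.rstrip("\n")]
    for suffix, remote_text in remotes.items():
        for block in service_blocks(remote_text):
            parts.append(tag_services(block, suffix))
    return "\n\n".join(parts) + "\n"
```

Two caveats that the sketch does not solve: Nagios rewrites its own status.dat periodically, so writing the merged result to a separate file is probably safer than editing the live one, and whether the web interface will display services that are not defined in the local object configuration would need to be verified.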
[Nagios-users] can log show actual command executed?
I am having a problem figuring out what is actually being executed from a service. Is there a way to get the nagios log to contain the actual command being executed?

This is what I am seeing in the Nagios.log file:

    [1290013792] SERVICE ALERT: myhost.com;Special App;CRITICAL;SOFT;1;(Service Check Timed Out)

This is what I see in the nagios.dat file:

    check_command=check_http!/myURL!alive

So, this shows me what the command string is in the service.cfg. I cannot see, though, what the actual command line is at this moment in time. It turns out that this check_command corresponds (I think) to:

    check_http -u /myURL -s alive

How would I know this, though, if the command definition had been changed or if it is using, because of a mis-spelling, a command I do not think it is using? If I go into the command.cfg and switch the order of parameters, for example, I see nothing in these logs that tells me what is doing what.

I know the simplest answer is "You should not do that." But my point is that the log file does not have enough information to tell me what happened at a past moment of time. I would need the log information _and_ the state of the command definitions at that time. If a log does not show you what happened in the past, what is its purpose?

I am having a problem with a particular web application. For some reason I put in the check and it fails. I execute the check_http that I _think_ this service is doing, and it gives me an OK. I ended up creating a custom executable that calls curl and fetches against the same URL and this now works fine. Kind of lame, though. I use check_http in about 100 other services. So, why is this one single service not working? An obvious answer is that I am not calling the command in the way I think I am. But if I look in the log to see what the service did, I can see what I _think_ it did based on what I can see in what I _think_ is the correct command definition. But I really do not know. I do not see a line like "check_http -u /myURL -s alive" in the log, so I cannot see if I am mis-reading things.

Any suggestions?

- ray
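The step the message is missing from the log is the expansion of a check_command string into a command line. The sketch below reproduces that expansion by hand so the result can be compared with what one believes is being run. It is a simplification: the command definition and the $USER1$ and $HOSTADDRESS$ values are hypothetical, escaped "!" characters are not handled, and macros without a value are replaced by an empty string.

```python
import re

MACRO = re.compile(r"\$(\w+)\$")


def expand_command(check_command, command_definitions, macros=None):
    """Expand 'name!arg1!arg2' using the command_line template for 'name'.

    command_definitions maps command names to command_line templates, as
    they would appear in a command definition (hypothetical data here).
    """
    name, *args = check_command.split("!")
    template = command_definitions[name]
    values = dict(macros or {})
    # $ARG1$, $ARG2$, ... are filled from the '!'-separated arguments.
    for index, argument in enumerate(args, start=1):
        values[f"ARG{index}"] = argument
    return MACRO.sub(lambda match: values.get(match.group(1), ""), template)


# Hypothetical command definition and macro values for illustration only.
definitions = {
    "check_http": "$USER1$/check_http -H $HOSTADDRESS$ -u $ARG1$ -s $ARG2$",
}
site_macros = {"USER1": "/usr/lib/nagios/plugins", "HOSTADDRESS": "myhost.com"}
print(expand_command("check_http!/myURL!alive", definitions, site_macros))
```

Swapping $ARG1$ and $ARG2$ in the template changes the printed line, which is exactly the kind of change the message says cannot be seen in the log.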
[Nagios-users] recognizing string from HTTP checks?
Sorry if this is a newbie question. I am running Nagios 3.0.6 on a Centos Linux system. It is working, but I am having difficulty getting it to recognize the response from an HTTP service request.

I have a bunch of domains served by my ISP. I changed the hosting location for them. I want to confirm that these domains are pointing to my host, where I have a page that just says "For any questions about this domain, please contact r...@ganymede.org." If the domain is being served by my ISP, it says things like "Drop 25 Pounds in 30 days" and "Hotwire Hotels". I do not want this.

So I put this in my objects/localhost.cfg:

    define host{
        use        linux-server
        host_name  www.domain1.com
        address    www.domain1.com
    }

    define service{
        use                    local-service
        host_name              www.domain1.com
        service_description    HTTP
        check_command          check_http!-s "For any questions about this domain"
        notifications_enabled  0
    }

I have two domains that demonstrate the problem. The "www.domain1.com" is being served correctly and "www.domain2.com" is not. In the "status.cgi?host=all" page, I see:

    HTTP  www.domain1.com  OK  07-17-2009 16:31:30  6d 21h 33m 55s  1/4  HTTP OK HTTP/1.1 200 OK - 491 bytes in 0.788 seconds
    HTTP  www.domain2.com  OK  07-17-2009 16:31:09  2d 17h 14m 16s  1/4  HTTP OK - HTTP/1.1 302 Moved Temporarily - 0.416 second response time

Why is this not recognizing that "www.domain2.com" is not returning the string I want it to return? Even if the domain is being redirected, it is being redirected wrong. At no point is that URL returning the string I want it to see. So, "www.domain2.com" should be red. But it is green.

Any suggestions on how to make this work?

thanx - ray
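The behaviour wanted in this message is "anything other than a 200 response containing the expected string is a failure, redirects included". As far as I know, check_http has a -f/--onredirect option that controls how redirects are treated and its default is to report them as OK, which would explain the green 302; that is worth confirming against the plugin's own help output. The sketch below shows the desired decision logic independently of check_http. The function names are hypothetical, and fetch() uses Python's http.client, which does not follow redirects on its own.

```python
import http.client

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def evaluate(status, body, expected):
    """Decide the check result from an HTTP status code and response body."""
    if 300 <= status < 400:
        return CRITICAL, f"HTTP CRITICAL - redirected ({status}) instead of serving the page"
    if status != 200:
        return CRITICAL, f"HTTP CRITICAL - unexpected status {status}"
    if expected not in body:
        return CRITICAL, "HTTP CRITICAL - expected string not found"
    return OK, "HTTP OK - expected string found"


def fetch(host, path="/", timeout=10):
    """Fetch one page without following redirects; return (status, body)."""
    connection = http.client.HTTPConnection(host, timeout=timeout)
    try:
        connection.request("GET", path)
        response = connection.getresponse()
        return response.status, response.read().decode("utf-8", errors="replace")
    finally:
        connection.close()
```

A plugin built on this would call evaluate(*fetch(host), expected), print the message and exit with the returned code.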