It looks like the SmokePing CGI is timing out while reading data.
Is this box I/O bound?
What does top show when you try to get a web-page from SP? [load averages in particular]
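If you'd rather grab the load averages from a script than eyeball top, something like the following rough sketch would do it. It only uses Python's standard `os.getloadavg()` and `os.cpu_count()`; the function name `load_report` and the "5-minute load above CPU count" rule of thumb are my own assumptions, not anything SP provides. On Linux the load average also counts processes stuck in uninterruptible (disk) wait, so a high load with mostly idle CPUs hints at an I/O problem, but it is not proof.

```python
import os


def load_report():
    """Return the load averages next to the CPU count (hypothetical helper)."""
    # os.getloadavg() is Unix-only: 1-, 5- and 15-minute load averages.
    one, five, fifteen = os.getloadavg()
    cpus = os.cpu_count() or 1
    return {
        "load": (one, five, fifteen),
        "cpus": cpus,
        # Rule of thumb only (assumption): sustained load above the CPU count.
        "sustained_over_cpus": five > cpus,
    }


if __name__ == "__main__":
    print(load_report())
```

Run it once while the box is idle and once while you request a page from SP, then compare the two.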
In any case, you need to figure out why the CGI is failing to read the data in the allowed time of 40 secs.
Changing the default time-out might help if the box is I/O bound, but not totally buried. [And I'm not sure where that setting lives.]
However, if the box is seriously overloaded I/O-wise, then waiting longer won't really solve your problem - it will just push the box further under water.
[And this all gets back to: how many RRDs are there, and how big are they? See the database section. Are there slaves? If so, how many?]
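To answer the "how many and how big" question quickly, a small sketch along these lines would count the RRD files and add up their sizes. The function name `rrd_inventory` is made up for illustration, and the directory you pass in is an assumption: use whatever `datadir` points to in your own config.

```python
import os


def rrd_inventory(data_dir):
    """Return (file count, total bytes, largest file in bytes) for *.rrd files."""
    count = 0
    total = 0
    largest = 0
    # Walk the whole tree, since targets are normally nested in subdirectories.
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.endswith(".rrd"):
                size = os.path.getsize(os.path.join(root, name))
                count += 1
                total += size
                largest = max(largest, size)
    return count, total, largest
```

Calling `rrd_inventory("/path/to/your/datadir")` (placeholder path) gives you three numbers you can paste into a reply.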
Finally:
>Is fping being ran as soon as the cgi script is executed from the webserver?
You appear to misunderstand how SP works. The daemon runs fping, logs the results, and writes them to the RRDs. The CGI only pulls data from the RRDs and generates graphs for the HTTP output, so fping is not run when the CGI script executes.
It appears from the SP debug log that writing the data went fine. [At least for the small subset of targets.]
However, the CGI appears to fail or time out while reading the RRDs to generate the graphs. [Or while reading something, in any case.]
Is SELinux or AppArmor running? If so, stop it or switch it to permissive mode (complain mode for AppArmor) and see if that helps.
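One crude way to check whether plain disk reads are the slow part is to time reading every RRD file and compare that with the 40-second limit. This is only a sketch under assumptions: `time_raw_reads` is a hypothetical helper, the directory is whatever your config uses, and the CGI does not read whole files the way this does, so treat the result as a rough probe of disk responsiveness. A second run will usually be much faster because the files are then in the page cache.

```python
import os
import time


def time_raw_reads(data_dir, budget_s=40.0):
    """Read every *.rrd file once; return (bytes read, seconds, within budget?)."""
    start = time.monotonic()
    nbytes = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if not name.endswith(".rrd"):
                continue
            with open(os.path.join(root, name), "rb") as fh:
                # Read in 1 MiB chunks so large files do not sit in memory.
                while True:
                    chunk = fh.read(1 << 20)
                    if not chunk:
                        break
                    nbytes += len(chunk)
    elapsed = time.monotonic() - start
    return nbytes, elapsed, elapsed <= budget_s
```

If even this simple pass takes anywhere near 40 seconds, the disk is the problem and raising the time-out will not buy much.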
-Greg
Forgot to add the smoke.log: http://pastebin.com/20UbvJVx

At the bottom of the log you can see that I also tried timing fping (the same command that smokeping was running) and it looks like it took 19.3 seconds to run for a small number of machines. Would that cause it to time out? Is fping being ran as soon as the cgi script is executed from the webserver?

On Tue, Mar 4, 2014 at 4:10 PM, Brett Bronson <brett.bron...@bigblockla.com> wrote:

Here is the apache error log that is listing smokeping: http://pastebin.com/Knm1Cmw1

As for debug mode, here's my output: http://pastebin.com/8txnhnkv

The host names do resolve; here's an example:

    [04:07 PM]superuser@pipeline[/opt/smokeping/bin] > time fping larender001a
    larender001a is alive

    real 0m0.014s
    user 0m0.000s
    sys 0m0.000s

On Tue, Mar 4, 2014 at 3:32 PM, Brett Bronson <brett.bron...@bigblockla.com> wrote:

Also, it looks like the version I have running is actually the latest, I assumed it would output the version as 2.6.9. Sorry

On Tue, Mar 4, 2014 at 3:29 PM, Brett Bronson <brett.bron...@bigblockla.com> wrote:

Okay, it looks like I was actually using an older version of smokeping. I've removed it and installed the latest version on the site and my config is as follows: http://pastebin.com/ZsLE8uCp

Before, I was able to get smokeping to work fine up until I added the section:

    + nodes
    menu = Render Node Latency
    title = Render Node Latency (ICMP Pings)

    ++ larender001a
    host = larender001a

    ++ larender001b
    host = larender001b

    ++ larender001c
    host = larender001c

    ++ larender001d
    host = larender001d

    ++ larender002a
    host = larender002a

    ++ larender002b
    host = larender002b

    ++ larender002c
    host = larender002c

    ++ larender002d
    host = larender002d

Now that I look at the logs, it looks like it's still using the old version....

    [ ... ]
    Tue Mar 4 15:03:05 2014 - FPing: probing 5 targets with step 300 s and offset 116 s.
    Tue Mar 4 15:16:01 2014 - Smokeping version 2.006009 successfully launched.
    Tue Mar 4 15:16:01 2014 - Not entering multiprocess mode for just a single probe.
    Tue Mar 4 15:16:01 2014 - FPing: probing 13 targets with step 300 s and offset 163 s.
    Tue Mar 4 15:25:59 2014 - Smokeping version 2.006009 successfully launched.
    Tue Mar 4 15:25:59 2014 - Not entering multiprocess mode for just a single probe.
    Tue Mar 4 15:25:59 2014 - FPing: probing 13 targets with step 300 s and offset 159 s.

Before, I used sudo apt-get install smokeping to install, but I later removed it using sudo apt-get remove smokeping; however, it looks like it didn't remove the old version? Any idea how I could resolve this so that it loads up the newer version?

On Tue, Mar 4, 2014 at 2:28 PM, Gregory Sloop <gr...@sloop.net> wrote:

I don't see a database section, so I assume it's somewhere else. [Nothing looks obviously wrong - but that was just a quick glance.]

But when you first start SP after adding a bunch of targets, it's going to have to allocate/create the RRD for each of the targets. [Also, are there slaves, because it will create X * 60 new RRD's - where X is how many slave SP instances you have. (In addition to the master RRD's) ]

I wouldn't think that would take 10m, but I can't see how much data you're stuffing in each RRD, or if you have slaves, which might help explain it.

As to why web-pages won't work, I'm not sure. Have you looked at the apache logs to see what they say? Or run SP in debug mode? [smokeping --debug IIRC]

-Greg
--
Brett Bronson
Big Block | Pipeline TD
http://www.bigblockla.com
[m] 805-338-6520
--
Gregory Sloop, Principal: Sloop Network & Computer Consulting
Voice: 503.251.0452 x82
EMail: gr...@sloop.net
http://www.sloop.net
---
_______________________________________________ smokeping-users mailing list smokeping-users@lists.oetiker.ch https://lists.oetiker.ch/cgi-bin/listinfo/smokeping-users