Hi all,

Firstly, thanks for this very useful software.

I'm trying to set up smokeping to run pings continuously, or as close to continuously as possible. I see this has been attempted, or at least considered, before:

http://thread.gmane.org/gmane.network.smokeping.user/4202

My motivation is the same as the original poster's there: I need to capture every single second where there might be network issues. I was therefore very surprised to discover that smokeping does not seem to support this use case.

I found the --nosleep parameter, which is mentioned very briefly in the docs as being "for debugging". Looking at the main while loop in Smokeping.pm, it seems that this option eliminates all sleeps, so it's not possible to have certain probes running continuously whilst others sleep as normal.

Then I looked at setting the 'step' configuration parameter to the duration of the probe. The docs describe this parameter as follows:

    Duration of the base operation interval of SmokePing in seconds. SmokePing will venture out every step seconds to ping your target hosts.

Looking at the source code, I see that the intention is that "every step seconds" includes the runtime of the probe. For example, if you have step=60 and pings=10, the probes are launched every 60 seconds, not every 70 seconds.

For continuous pinging, one might think that setting step=10 and pings=10 would yield the desired results. However, there are two problems with this.

Firstly, the code in question is:

    my $sleeptime = $step - (time-$offset) % $step;
    [...]
    sleep $sleeptime;

so if the probe takes even a fraction over 10 seconds, it ends up sleeping for nearly another 10 seconds, until the next 10-second boundary. This means the pings are only happening roughly 50% of the time, not continuously. A hack might be to set pings=9, but then the pings are only happening 90% of the time. It's possible to get arbitrarily close to 100% by making both values very high, e.g. step=1000 and pings=999, but then you lose the granularity of the RRD results.
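
To make the arithmetic above concrete, here is a rough Python model of the sleep calculation. It is not SmokePing code: the function names are made up for illustration, it assumes each ping takes about one second (so pings=N means a probe runtime of roughly N seconds), and it treats time as continuous, whereas the real loop may work in whole seconds.

```python
def sleep_time(step, now, offset=0.0):
    """Model of the quoted Perl: $step - (time - $offset) % $step.

    The result always lies in (0, step]: it is the time remaining
    until the next step boundary, never zero.
    """
    return step - (now - offset) % step


def duty_cycle(step, probe_runtime, rounds=1000):
    """Fraction of wall-clock time spent pinging over many rounds.

    probe_runtime is an assumed, constant duration for one round of
    pings; real probe runtimes will vary from round to round.
    """
    now = 0.0
    busy = 0.0
    for _ in range(rounds):
        now += probe_runtime           # the probe runs
        busy += probe_runtime
        now += sleep_time(step, now)   # then sleep to the next boundary
    return busy / now


if __name__ == "__main__":
    # (step, assumed probe runtime in seconds)
    for step, runtime in [(10, 10.1), (10, 9), (1000, 999)]:
        print(step, runtime, round(duty_cycle(step, runtime), 3))
```

Under these assumptions, a 10.1-second probe with step=10 is busy for only about half of each 20-second cycle, while the step=10/pings=9 and step=1000/pings=999 cases come out at about 90% and 99.9%.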

The second issue is that there is a hardcoded expectation that the probe runtime will be less than 80% of the polling cycle:

    elsif ($runtime > $step * 0.8) {
        my $warn = "NOTE: smokeping took $runtime seconds to complete 1 round of polling. ".
            "This is over 80% of the max time available for a polling cycle ($step seconds).\n";
        if (defined $myprobe) {
            $probes->{$myprobe}->do_log($warn);
        } else {
            do_log($warn);
        }
    }

So if you choose continuous pinging, which seems to me (and presumably to the original poster) to be a perfectly reasonable use case, your logfiles get spammed with these messages.

My workaround for now is as follows:

1. Configure step=10 and pings=10
2. Comment out the lines causing the warnings above
3. Invoke with --nosleep

I had a couple of other minor questions:

1. I see mentions of an svn repository on the mailing list, but nothing is published. Where is the latest code available, and what's the standard procedure for submitting patches?
2. Why does --debug exit immediately after the first iteration? What if you want to debug the sleep cycle?

Many thanks!
Adam
