Problem: Smokeping stops collecting data via RemoteFPing.

Description: After several hours of operation, Smokeping stops collecting data via the RemoteFPing probe. Whenever this happens, I find one hung RemoteFPing job among the running processes, and that job _never_ finishes.
There is some packet loss between smokeping_host and remote_fping_host (it's a radio link), and I think that packet loss is what causes the hang. When I manually kill the hung RemoteFPing job, Smokeping resumes and continues collecting data from the other probes without problems.

Some debug information: I'm running it on Slackware and Debian Linux with the latest rrdtool, libraries, etc. compiled manually from source.

# rrdtool fetch myobject.rrd AVERAGE -s -1h
1083246300: nan 0.0000000000e+00 3.3827010000e-02 2.5839726667e-02 2.8640310000e-02 2.8944346667e-02 2.9458430000e-02 3.0529003333e-02 3.3827010000e-02 .5195360000e-02 3.5769113333e-02 8.4483206667e-02 1.4752046000e-01
1083246600: nan nan nan nan nan nan nan nan nan nan nan nan nan
1083246900: nan nan nan nan nan nan nan nan nan nan nan nan nan
1083247200: nan nan nan nan nan nan nan nan nan nan nan nan nan

This message appears in the logs when I manually kill the hung job:

> WARNING: smokeping took 1221 seconds to complete 1 round of polling.
> It should complete polling in 300 seconds. You may have unresponsive devices in your setup.

Is it possible to fix this problem in future releases? Maybe it would be reasonable to add a timeout to Smokeping, so that it does not wait indefinitely for a hung job to complete?

I'm sorry for my poor English.
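
To illustrate the kind of timeout I have in mind, here is a minimal sketch in Python. It is not Smokeping's code (Smokeping is written in Perl), and the function name `run_with_timeout` and the time limits are hypothetical. It only shows the general idea: give the child process a deadline, kill it when the deadline passes, and let the caller carry on with the other probes.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return its stdout, or None if it ran longer than timeout_s.

    Hypothetical helper for illustration only; not part of Smokeping.
    """
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,  # collect stdout/stderr instead of inheriting them
            text=True,            # decode output to str
            timeout=timeout_s,    # deadline for the whole command, in seconds
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed the child and waited for it
        # before re-raising, so no hung process is left behind.
        return None
    return result.stdout


# A stand-in for a remote job that hangs: a child that sleeps far past the limit.
hung_job = [sys.executable, "-c", "import time; time.sleep(30)"]

if run_with_timeout(hung_job, 1) is None:
    print("job timed out and was killed; continuing with the other probes")
```

In a real setup the command would be the remote fping invocation, and the limit would presumably be chosen well below the polling step (300 seconds in the log above), so that one hung job cannot block a whole round.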
