Aha, that is what I was looking for. Basically, that is what I wanted
to hear: "it always retries many times at a short interval (shorter
than the poll interval) before reporting an outage."
So with that said, could the delay feature allow a wait time of less
than 1 minute? I think even a 30-second delay would be helpful.
Thank you for the explanation, Richard.
PS. Dartware communications are awesome; I wish every company would
respond like that (fast and detailed) to its customers.
__________________________________________________________
Andrey Gordon | Integrity Interactive | Network Engineer |
+1.781.398.3518
On Sep 1, 2009, at 5:25 PM, Richard E. Brown wrote:
Andrey,
OK. I see that my discussion of delayed notifications didn't quite
address your question - you're more interested in the polling
algorithm. Bob Merrill's quote from the User Guide gives the
details, but maybe not quite in the terms that you're asking...
In fact, I believe that the current InterMapper behavior already
does what you describe - it always retries many times at a short
interval (shorter than the poll interval) before reporting an outage.
There are two separate cases:
1) In TCP probes (SMTP, HTTP, etc.) InterMapper reports an outage
when it has failed to establish a connection within 60 seconds (the
default). InterMapper relies on the underlying OS facility to
retransmit the SYN/SYN-ACK packets until the connection succeeds or
it has failed multiple times.
Consequently, if InterMapper reports an outage of a TCP-based probe,
it has definitively failed. There have been several attempts to
start the connection (using the TCP retry algorithm, with
exponential backoff, etc.) and it's fair to say that at least one
device (InterMapper) has already failed to connect. There's no need
to retry again.
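
As a rough illustration of the TCP case (not InterMapper's actual code),
a probe of this kind could be sketched in Python as below. The function
name tcp_probe and its parameters are hypothetical; the 60-second default
mirrors the timeout mentioned above. The application makes a single
connect call, and the operating system's TCP stack handles the SYN
retransmissions during that window.

```python
import socket


def tcp_probe(host, port, timeout=60.0):
    """Return True if a TCP connection is established within `timeout` seconds.

    Hypothetical sketch: the application calls connect once, and the OS
    TCP stack retransmits the SYN (with backoff) until the connection
    succeeds or the timeout expires.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

A False result here already reflects several OS-level attempts, which is
why an extra application-level retry adds little.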
2) For UDP probes (Ping, SNMP, etc.) InterMapper sends N requests
(default is three attempts) spaced M seconds apart (default is three
seconds; both parameters are settable) before declaring the device
to be down.
In this case, too, if InterMapper reports an outage, the device has
failed to respond to a number of short-interval requests, and
there's no additional need to retry.
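
The UDP retry behavior described above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not
InterMapper's implementation: probe_with_retries, send_request, and the
injectable sleep argument are hypothetical names, and the defaults of
three attempts spaced three seconds apart are taken from the description
above.

```python
import time


def probe_with_retries(send_request, attempts=3, interval=3.0, sleep=time.sleep):
    """Return True as soon as one request is answered, else False.

    send_request is a hypothetical callable that sends one request
    (e.g. a ping or SNMP get) and returns True if a reply arrived.
    """
    for attempt in range(attempts):
        if send_request():
            return True  # device responded; no outage
        if attempt < attempts - 1:
            sleep(interval)  # wait before the next short-interval retry
    return False  # every attempt went unanswered; report the device down
```

For example, probe_with_retries(lambda: False) waits twice (three
seconds each time) and then returns False, while a device that answers
the second request is reported as up after a single wait.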
Having said that... :-)
The subject line of this note is "False Positives protection", and I
imagine that you're seeing alerts from things that aren't really a
problem.
Could you tell me a little more about what kinds of devices have the
problem, what the probe is, the poll interval, timeout, and the
retry count/retry interval if it's a UDP probe? Thanks.
Rich
____________________________________________________________________
List archives: http://www.mail-archive.com/intermapper-talk%40list.dartware.com/
To unsubscribe: send email to: [email protected]