Thanks for this info Bill.
That's not exactly what I was looking for. For example, in the case of
HTTP probes, according to the logic below, I'll get alerted the second
IM detects that the content does not match. I was interested in IM
double checking and triple checking (or whatever the number is) that
this is in fact true before alerting, and doing so before the next poll
cycle. In other words, immediately on detecting a condition that is
different from the current one.
A good example is getting a 500 error instead of a 200 OK response from
a web server late at night, when a DB maintenance job runs and
occasionally the DB can't respond to the web server in time. This
error clears on the next run of the probe 99% of the time, so alerting
on it at 3am is not very desirable.
But if it does in fact stay like that for 3-4 runs of the probe, I'd
like to know even at 3am. Note that 3-4 runs of the probe should not
have to take 2 min (four polls at a 30 sec interval). If my page is out
for 2 min I'm in bigger trouble, much bigger than if it throws a 500
error for a fraction of a second....
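
To make that concrete, here is a minimal Python sketch of the kind of
immediate re-checking I mean. This is not how InterMapper works today;
`confirmed_status`, `probe`, and `rechecks` are hypothetical names made
up for illustration.

```python
from typing import Callable


def confirmed_status(probe: Callable[[], str], current: str,
                     rechecks: int = 3) -> str:
    """Accept a status change only if immediate re-checks agree with it.

    probe    -- hypothetical callable that runs one probe and returns a
                status string such as "OKAY" or "ALARM"
    current  -- the status the device had before this poll
    rechecks -- assumed setting: extra probes that must confirm a change
    """
    observed = probe()
    if observed == current:
        return current      # nothing changed, nothing to confirm
    for _ in range(rechecks):
        if probe() != observed:
            return current  # the change did not hold: treat it as a blip
    return observed         # the change held for every re-check
```

With `rechecks=3`, a single 500 followed by a 200 leaves the status
unchanged, while four bad results in a row change it right away instead
of waiting for several poll intervals.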
Hopefully this makes a bit of sense :)
__________________________________________________________
Andrey Gordon | Integrity Interactive | Network Engineer |
+1.781.398.3518
On Sep 1, 2009, at 3:51 PM, Bob Merrill wrote:
Hi Andrey,
I don't know if these answer your question specifically, but I found
them in the User Guide.
Packet-based Test Procedure
Whenever InterMapper tests a packet-based device, it uses the
following procedure:
1. InterMapper sends the appropriate probe packet (ping, SNMP
get-request, DNS query, etc.).
2. InterMapper waits the timeout interval specified for the
particular device.
3. If a response arrives, InterMapper examines its contents and
sets the device status based on that response.
4. However, if no response arrives, InterMapper sends another probe
packet.
5. The above procedure is repeated until a response arrives or the
specified number of probes has been sent.
6. If no response has arrived after the final timeout, InterMapper
sets the device status to Down.
7. In any event, the device is scheduled to be tested again at a
time set by the map's (or the device's) poll interval.
The default timeout is three seconds, with a default probe count of
three. Consequently, InterMapper will take nine seconds to declare
that a device is down (three probes, waiting three seconds each).
Both the timeout and the number of probes can be set for each device.
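
The packet-based procedure and its defaults can be sketched in Python
roughly as follows. This is only an illustration of the steps quoted
above, not InterMapper code; `packet_probe`, `send_and_wait`, and
`worst_case_seconds` are hypothetical names, and the callable is assumed
to return a status string for a response or None on timeout.

```python
from typing import Callable, Optional


def packet_probe(send_and_wait: Callable[[float], Optional[str]],
                 timeout: float = 3.0, probe_count: int = 3) -> str:
    """Send up to probe_count probes; stop at the first response."""
    for _ in range(probe_count):
        # Send one probe packet and wait up to `timeout` seconds.
        status = send_and_wait(timeout)
        if status is not None:
            return status   # status derived from the response contents
    return "DOWN"           # no response after the final timeout


def worst_case_seconds(timeout: float = 3.0, probe_count: int = 3) -> float:
    """Time needed to declare a silent device down."""
    return timeout * probe_count
```

With the default values, `worst_case_seconds()` gives the nine seconds
mentioned above.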
This often gives rise to reported outages of 21 or 51 seconds. What's
happening here is that the device fails to respond to one set of
probes (for example, after nine seconds), but responds immediately
at the next poll 30 or 60 seconds later. This gives an outage
duration of (30-9=21) seconds or (60-9=51) seconds.
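
The same arithmetic as a small Python helper, assuming the device
answers the very next poll immediately (`reported_outage` is a
hypothetical name used only for this illustration):

```python
def reported_outage(poll_interval: float, timeout: float = 3.0,
                    probe_count: int = 3) -> float:
    """Seconds between being declared down and the next successful poll."""
    return poll_interval - timeout * probe_count
```

For example, `reported_outage(30)` is 21.0 and `reported_outage(60)` is
51.0.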
TCP-based probes
Probes like "HTTP", "SMTP", and "LDAP" and others test the ability
of a server to accept a TCP connection on a specific listening port,
and to respond to a scripted interchange.
For more information on TCP-based probes, see Server Probes -
Standard.
What Happens in a TCP-based interchange
1. InterMapper first attempts to connect to the specified port at
the device's address.
2. If this connection attempt fails, InterMapper shows the device
in the DOWN state.
If InterMapper successfully connects to the listening port,
InterMapper sends protocol-specific commands through the TCP
connection to test the server's responses and compare them to
expected values.
3. InterMapper changes the status of the device (e.g. ALARM,
WARNING, OKAY, DOWN) if an error condition is detected, or if the
execution of InterMapper's probe is interrupted for any reason.
4. If InterMapper doesn't receive a proper response for 60 seconds,
or if the TCP connection is lost while waiting for a response, the
InterMapper probe will set status of the device to the proper
condition.
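
As a rough illustration of such an interchange, here is a minimal HTTP
check in Python using the standard socket module. It is a sketch, not
InterMapper's probe: `http_probe` and its parameters are hypothetical,
and mapping a lost connection or timeout to DOWN and an unexpected
status code to ALARM is an assumption, since the guide only says "the
proper condition".

```python
import socket


def http_probe(host: str, port: int = 80, path: str = "/",
               expected_code: bytes = b"200", timeout: float = 60.0) -> str:
    """Connect, send one HTTP request, and compare the status code."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "DOWN"       # the connection attempt failed
    try:
        with sock:
            request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
            sock.sendall(request.encode("ascii"))
            # Only the first line of the reply is examined in this sketch.
            status_line = sock.recv(1024).split(b"\r\n", 1)[0]
    except OSError:
        return "DOWN"       # assumption: timeout or lost connection => DOWN
    parts = status_line.split()
    code = parts[1] if len(parts) > 1 else b""
    return "OKAY" if code == expected_code else "ALARM"
```

Note that a single 500 reply turns straight into ALARM here, which is
the behavior the question in this thread is about.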
Bob Merrill (Documentation Guy)
----- Original Message ----- From: "Andrey Gordon" <[email protected]>
To: "InterMapper Discussion" <[email protected]>
Sent: Tuesday, September 01, 2009 3:16 PM
Subject: [IM-Talk] False positives protection
You can probably tell I got back to tweaking IM again.....
Anyway, I was curious whether there is a mechanism built into the
IM server to double check failures. For example, say a probe
detects a condition that normally triggers an alert (let's say
Alarm). Does InterMapper double check that the condition in fact
exists, and was not just a blip, glitch, etc., before alerting?
I'd be interested to see a feature that allows control over that
kind of behavior. Let's say, just as under Notifiers one can
control the number of repeating alerts, it would be nice to
control there the number of times the condition has to be
triggered before an alert goes out.
The reason this is important is so we could make sure that the
condition is there before sending an alert to a pager at 3am. I'd
like to be able to set up alerts going to email after one
occurrence and to a pager after, let's say, 3. Many times, by the
time I crawl out of bed to see what it is, it has already cleared.
Setting it up with delayed notification kind of does the job, but
not really. All it does is delay the notification for a minimum of
a minute, which with the default probe timers only gives it one
extra chance to probe the device.
I'd like to see InterMapper, after detecting a condition, being able
to verify it a controllable number of times before sending an alert
to a given notifier.
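
Something along these lines, as a minimal Python sketch of the requested
behavior. None of this exists in InterMapper as far as I know;
`ConfirmingNotifier`, `thresholds`, and `record` are hypothetical names,
and treating any status other than "OKAY" as a failure is a
simplification.

```python
from typing import Dict, List


class ConfirmingNotifier:
    """Fire each notifier once the failure has repeated enough times."""

    def __init__(self, thresholds: Dict[str, int]):
        self.thresholds = thresholds  # e.g. {"email": 1, "pager": 3}
        self.streak = 0               # consecutive failed probe results

    def record(self, status: str) -> List[str]:
        """Record one probe result; return the notifiers to fire now."""
        if status == "OKAY":
            self.streak = 0           # the condition cleared, start over
            return []
        self.streak += 1
        return [name for name, needed in self.thresholds.items()
                if needed == self.streak]
```

With `{"email": 1, "pager": 3}`, the first failed result goes to email
only, and the pager fires only if the third result in a row is still
bad.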
Consider this a feature request, but I'd still want to understand
what the logic of this behavior is today as implemented. I'm
suspecting many on this list would be interested to hear how IM
behaves on condition detection....
Thank you,
Andrey
__________________________________________________________
Andrey Gordon | Integrity Interactive | Network Engineer |
+1.781.398.3518
____________________________________________________________________
List archives: http://www.mail-archive.com/intermapper-talk%40list.dartware.com/
To unsubscribe: send email to: [email protected]