Andrea:

I've never used the utility 'Mon', but I've spent a good deal of time
configuring systems for HA (High Availability).  It's not uncommon for
a service to be flagged as down on periodic checks.  The solution to
false positives requires some give and take: the more failures you
require before raising an alarm, the longer a real outage goes unnoticed.

For example, when I was writing code to monitor a Sybase engine, I would
run a check every X seconds.  If the server failed to respond Y times in
a row, it was assumed down.  For other applications, we might have
polled every J seconds, and required more or fewer failures.
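
To make the idea concrete, here is a rough sketch in Python of the
"Y failures in a row" logic.  It is not Mon configuration and not the
code I actually ran against Sybase; the class and function names are
made up for illustration, and the check itself is whatever callable
you supply.

```python
import time


class ConsecutiveFailureDetector:
    """Hypothetical helper: treat a service as down only after
    `threshold` failed checks in a row (the "Y" above)."""

    def __init__(self, threshold):
        if threshold < 1:
            raise ValueError("threshold must be at least 1")
        self.threshold = threshold
        self.failures = 0  # length of the current run of failures

    def record(self, check_ok):
        """Record one check result; return True while the service
        is still considered up."""
        if check_ok:
            self.failures = 0  # any success resets the run
        else:
            self.failures += 1
        return self.failures < self.threshold


def poll(check, detector, interval_seconds, rounds, sleep=time.sleep):
    """Run `check` every `interval_seconds` (the "X" above) for
    `rounds` iterations and return the up/down state after each one.
    An exception raised by `check` counts as a failed check."""
    states = []
    for i in range(rounds):
        try:
            ok = bool(check())
        except Exception:
            ok = False
        states.append(detector.record(ok))
        if i < rounds - 1:
            sleep(interval_seconds)
    return states
```

With a threshold of 3 and a 60 second interval, a single timeout like
the one in your alert would never raise an alarm, but a real outage
would take roughly three minutes to be reported.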

If there is a way to configure Mon to report a service as down only
after a number of consecutive failures, then that is my recommendation.
Just because a service fails a test once doesn't mean that it's down.
It could just be busy.

David

Andrea Cerrito wrote:
> 
> Hi to all,
> 
> I have a server farm with pop3 / smtp / ftp services running on Linux and
> served by tcpserver. My monitoring software is Mon, and sometimes I'm
> receiving alarms about these services: they are always false alarms.
> 
> For example:
> 
> =======SERVICE IS MARKED AS DOWN==========
> Summary output        : **** Time Out
> 
> Group                 : pop3-a.frontend.int
> Service               : smtp
> Time noticed          : Tue Jun 12 13:27:10 2001
> Secs until next alert :
> Members               : pop3-a.frontend.int
> 
> Detailed text (if any) follows:
> -------------------------------
> pop3-a.frontend.int
> 
> ========SERVICE IS MARKED AS UP==========
> Summary output        : **** Time Out
> 
> Group                 : pop3-a.frontend.int
> Service               : smtp
> Time noticed          : Tue Jun 12 13:28:16 2001
> Secs until next alert :
> Members               : pop3-a.frontend.int
> 
> Detailed text (if any) follows:
> -------------------------------
> pop3-a.frontend.int
> 
> Just one minute (and I'm doing test every minute)... I'm trying to
> understand why I'm having those false alarms on only services running with
> tcpserver on Linux. I mean, if the service is running with tcpserver on
> Solaris or the services is running on linux without tcpserver, I've no
> errors (ie, qmail on solaris and Apache on linux).
> 
> Viewing logs, I've no errors.
> 
> What can be the problem?? What I've to search for??
> 
> Thanks
> 
> PS I didn't find a list about ucspi-tcp: if I wrote to wrong list, please
> tell me which is the correct one :)
> ---
> Cordiali saluti / Best regards
> Andrea Cerrito
> ^^^^^^^^^^^^^^
> Net.Admin @ Centro MultiMediale di Terni S.p.A.
> P.zzale Bosco 3A
> 05100 Terni IT
> Tel. +39 744 5441330
> Fax. +39 744 5441372
