Dirk, 
 
There are a couple of examples of the problem: 
1. Servers Alive hangs (hasn't happened in a while, thankfully) 
2. The machine Servers Alive is running on loses its network connection.  All 
of a sudden, you have hundreds of checks failing, and after a server reboot, 
everything is normal. 
 
As far as the example goes, yes, we would want to see each instance of an 
outage.  This report is used in our weekly team meeting where we discuss 
problems from the past week.  The report is called up during the meeting.  The 
report we have now includes records where the laststatuschange column shows a 
value within the last 7 days of the time the report was run. 
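
A minimal sketch of that 7-day filter, written in Python rather than SQL purely for illustration. It assumes the laststatuschange value is stored as text in the 'MM/DD/YYYY hh:mi:ss AM' layout used by the views below; the function name and the sample rows are hypothetical.

```python
from datetime import datetime, timedelta

# Oracle mask 'MM/DD/YYYY hh:mi:ss AM' expressed as a strptime format.
SA_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def changed_in_last_week(rows, now):
    """Keep rows whose laststatuschange is within the 7 days before `now`."""
    cutoff = now - timedelta(days=7)
    return [
        row for row in rows
        if datetime.strptime(row["laststatuschange"], SA_FORMAT) > cutoff
    ]


# Hypothetical sample rows, not real log data.
rows = [
    {"host": "LDAP Server", "laststatuschange": "08/27/2008 04:32:00 PM"},
    {"host": "Service A", "laststatuschange": "08/15/2008 03:10:00 AM"},
]
recent = changed_in_last_week(rows, now=datetime(2008, 8, 29, 9, 0, 0))
print([row["host"] for row in recent])
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the result repeatable when the report is re-run for an earlier meeting.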
 
We log the entries to an Oracle database and have two views set up to facilitate 
the reporting. 
 
This view shows entries that have gone down and come back up in the last week: 
 
DROP VIEW APPDEV.SA_STATS_V; 
 
/* Formatted on 2008/08/29 09:05 (Formatter Plus v4.8.8) */
CREATE OR REPLACE FORCE VIEW appdev.sa_stats_v (HOST, downat, upat)
AS
   SELECT HOST,
          TO_DATE (previousstatuschange, 'MM/DD/YYYY hh:mi:ss AM') AS downat,
          TO_DATE (laststatuschange, 'MM/DD/YYYY hh:mi:ss AM') AS upat
     FROM sa_stats_t s, sa_codes_t s1, sa_codes_t s2
    WHERE TO_DATE (s.laststatuschange, 'MM/DD/YYYY hh:mi:ss AM') > SYSDATE - 7
      AND s.status = s1.status
      AND s.previousstatus = s2.status
--and s1.description not in('Maintenance', 'Unavailable')
      AND s2.description NOT IN ('Maintenance', 'Unavailable')
-- Start permanently don't want entries
      AND s.HOST NOT IN
             ('Good Morning', 'CalgServer4 Server', 'ServerA.agrium.com',
              'ServerB.agrium.com', 'ServerC.agrium.com', 'ServerD.agrium.com') 
-- End permanently don't want entries
-- Start temporary don't want entries
      AND s.HOST NOT IN
             ('CalgServer2 LIMSCP Status Monitor Service',
              'CalgServer2 LIMSRP Event Monitor Service',
              'CalgServer2 LIMSCP Event Monitor Service',
              'CalgServer2 LIMSRP Status Monitor Service',
              'CalgServer2 Redwater LIMS Prod Database')
-- End temporary don't want entries
      AND s1.description = 'Up'
   UNION
   SELECT   s.HOST, h.laststatuschange AS downat,
            TO_DATE (NULL, 'MM/DD/YYYY hh:mi:ss AM') AS upat
       FROM sa_stats_t s,
            (SELECT   HOST,
                      MAX
                         (TO_DATE (laststatuschange, 'MM/DD/YYYY hh:mi:ss AM')
                         ) AS laststatuschange
                 FROM sa_stats_t
--  where to_date(laststatuschange, 'MM/DD/YYYY hh:mi:ss AM') > sysdate -30
             GROUP BY HOST) h
      WHERE s.status = 1
        AND TO_DATE (s.laststatuschange, 'MM/DD/YYYY hh:mi:ss AM') =
                                                            h.laststatuschange
-- Start permanently don't want entries 
      AND s.HOST NOT IN
             ('Good Morning', 'CalgServer4 Server', 'ServerA.agrium.com',
              'ServerB.agrium.com', 'ServerC.agrium.com', 'ServerD.agrium.com') 
-- End permanently don't want entries
-- Start temporary don't want entries
      AND s.HOST NOT IN
             ('CalgServer2 LIMSCP Status Monitor Service',
              'CalgServer2 LIMSRP Event Monitor Service',
              'CalgServer2 LIMSCP Event Monitor Service',
              'CalgServer2 LIMSRP Status Monitor Service',
              'CalgServer2 Redwater LIMS Prod Database')
-- End temporary don't want entries
        AND s.HOST = h.HOST
   GROUP BY s.HOST, h.laststatuschange; 
 
 
Here is the other view, which shows entries currently down: 
DROP VIEW APPDEV.SA_STATS_CURRENTLY_DOWN_V; 
 
/* Formatted on 2008/08/29 09:09 (Formatter Plus v4.8.8) */
CREATE OR REPLACE FORCE VIEW appdev.sa_stats_currently_down_v (HOST, downat)
AS
   SELECT   s.HOST, h.laststatuschange AS downat
       FROM sa_stats_t s,
            (SELECT   HOST,
                      MAX
                         (TO_DATE (laststatuschange, 'MM/DD/YYYY hh:mi:ss AM')
                         ) AS laststatuschange
                 FROM sa_stats_t
--  where to_date(laststatuschange, 'MM/DD/YYYY hh:mi:ss AM') > sysdate -30
             GROUP BY HOST) h
      WHERE s.status = 1
        AND TO_DATE (s.laststatuschange, 'MM/DD/YYYY hh:mi:ss AM') =
                                                            h.laststatuschange
-- Start permanently don't want entries
      AND s.HOST NOT IN
             ('Good Morning', 'CalgServer4 Server', 'ServerA.agrium.com',
              'ServerB.agrium.com', 'ServerC.agrium.com', 'ServerD.agrium.com') 
-- End permanently don't want entries
-- Start temporary don't want entries
      AND s.HOST NOT IN
             ('CalgServer2 LIMSCP Status Monitor Service',
              'CalgServer2 LIMSRP Event Monitor Service',
              'CalgServer2 LIMSCP Event Monitor Service',
              'CalgServer2 LIMSRP Status Monitor Service',
              'CalgServer2 Redwater LIMS Prod Database')
-- End temporary don't want entries
        AND s.HOST = h.HOST
   GROUP BY s.HOST, h.laststatuschange
   ORDER BY h.laststatuschange;

I hope this helps, 
 
Brett Hanson

>>> "Dirk" <[EMAIL PROTECTED]> 8/29/2008 8:50 AM >>>



If SA isn't running then it can't detect that change. There is not much we can 
do about that. 

Would this also mean that you would want (example): 

Check Name     Down At           Up At             Outage Duration
LDAP Server    27-Aug 4:06 PM    27-Aug 4:32 PM    26 Minutes
LDAP Server    27-Aug 7:06 PM    27-Aug 7:30 PM    24 Minutes
LDAP Server    28-Aug 4:06 PM    28-Aug 7:30 PM    204 Minutes

if the entry went down several times in the last week? 

 

 

Also for the week starting on Aug 24th (at midnight), how would you handle the 
fact that the entry went down on the 23rd at 11:58pm?  Should it mark this down 
at 23rd 11:58pm or starting 24th at midnight? 
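
The two options in that question can be compared with a short sketch. This is only an illustration in Python: the recovery time of 12:10am on the 24th is a made-up value, and `clip_to_window` is a hypothetical helper, not an existing Servers Alive function.

```python
from datetime import datetime


def clip_to_window(down_at, up_at, window_start, window_end):
    """Return the part of an outage inside the window, or None if outside."""
    start = max(down_at, window_start)
    end = min(up_at, window_end)
    return (start, end) if start < end else None


week_start = datetime(2008, 8, 24)       # Aug 24th at midnight
week_end = datetime(2008, 8, 31)
down_at = datetime(2008, 8, 23, 23, 58)  # went down on the 23rd at 11:58pm
up_at = datetime(2008, 8, 24, 0, 10)     # hypothetical recovery time

# Option 1: report the real down time, even though it precedes the week.
full_minutes = (up_at - down_at).total_seconds() / 60

# Option 2: count only the part of the outage that falls inside the week.
clipped = clip_to_window(down_at, up_at, week_start, week_end)
clipped_minutes = (clipped[1] - clipped[0]).total_seconds() / 60

print(full_minutes, clipped_minutes)
```

Option 1 keeps the "Down At" column truthful; option 2 keeps per-week totals from counting the same minutes in two different weeks.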

 

 

Dirk Bulinckx. 

From: Servers Alive Discussion List [mailto:[EMAIL PROTECTED] On Behalf Of 
Brett Hanson
Sent: Friday, August 29, 2008 4:41 PM
To: Servers Alive Discussion List
Subject: Re: [SA-list] Servers Alive and reporting 

 

One report we've been struggling with for the last couple of years is an activity 
report.  We'd like to see a report that shows outages detected by Servers Alive 
in the past 7 days.  This report would show the check name, when the check went 
down, when it came back up and the duration.  Any checks down when the report 
is run would show as 'Currently Down'.  


 


For example: 


Check Name     Down At           Up At             Outage Duration
LDAP Server    27-Aug 4:06 PM    27-Aug 4:32 PM    26 Minutes
Service A      29-Aug 3:10 AM    Currently Down    5 Hours 17 Minutes
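
The rows above could be produced along these lines. This is a rough Python sketch that assumes each outage is available as a (check, down_at, up_at) record, with up_at missing while the check is still down; `format_duration` and `report_row` are hypothetical names, and the run time of 8:27 AM is chosen only so that the figures match the example.

```python
from datetime import datetime


def format_duration(delta):
    """Render a timedelta in the 'H Hours M Minutes' style of the example."""
    total_minutes = int(delta.total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    if hours:
        return f"{hours} Hours {minutes} Minutes"
    return f"{minutes} Minutes"


def report_row(check, down_at, up_at, now):
    """Build one report row; an open outage is measured up to `now`."""
    end = up_at if up_at is not None else now
    up_text = up_at.strftime("%d-%b %I:%M %p") if up_at else "Currently Down"
    return (check, down_at.strftime("%d-%b %I:%M %p"), up_text,
            format_duration(end - down_at))


now = datetime(2008, 8, 29, 8, 27)  # assumed time the report is run
closed = report_row("LDAP Server", datetime(2008, 8, 27, 16, 6),
                    datetime(2008, 8, 27, 16, 32), now)
still_down = report_row("Service A", datetime(2008, 8, 29, 3, 10), None, now)
for row in (closed, still_down):
    print("    ".join(row))
```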


 


Our biggest issue is problems with accuracy that result when Servers Alive is 
restarted - a check that was down before Servers Alive was shut down and was up 
when Servers Alive started again is not detected as a status change, and no 
database record showing the transition exists. 
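
One way to picture a fix, purely as a sketch: if the last known status of every check were saved at shutdown, it could be compared with the first results after startup and the missing transitions written out. Everything below is an assumption for illustration (the saved-status dictionary, the function name, the record layout); it does not describe how Servers Alive actually works.

```python
from datetime import datetime


def reconcile_after_restart(saved_status, current_status, restart_time):
    """Return synthetic status-change records for checks whose status
    differs between the saved snapshot and the first check cycle."""
    changes = []
    for check in sorted(saved_status):
        old = saved_status[check]
        new = current_status.get(check)
        if new is not None and new != old:
            changes.append({
                "check": check,
                "previousstatus": old,
                "status": new,
                # The real change time is unknown; the restart time is only
                # the latest moment at which it can have happened.
                "laststatuschange": restart_time,
            })
    return changes


saved = {"LDAP Server": "Down", "Service A": "Up"}  # snapshot at shutdown
current = {"LDAP Server": "Up", "Service A": "Up"}  # first cycle after start
missed = reconcile_after_restart(saved, current, datetime(2008, 8, 29, 8, 0))
print(missed)
```

The outage duration derived from such a record would be an upper bound, since the check may have recovered at any point while monitoring was stopped.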


 


Regards, 

Brett Hanson 
Systems Analyst 
Agrium

>>> "Dirk" <[EMAIL PROTECTED]> 8/29/2008 3:50 AM >>>
One frequently recurring question about Servers Alive is whether it can do reporting.
We always point to the HTML template-based output and to the DB logging (using a
3rd-party report writer).  Still, it seems that this is not what people are
looking for.


That's why we would like you to help us with some brainstorming around that
reporting feature.

I'll start by giving my own idea on it.
* it's based on the HTML template-based output
* it can be set to be executed (generated) once a day and you can select
what "entries" go on it
* you can of course have several outputs, each for a different set of entries
* additional parameters are needed, like
% up cycles
% down cycles
% maintenance cycles
and these per DAY, WEEK, MONTH, YEAR
with "easy" access to the current (day/week/...) and the previous
(day/week/...) and also access to other days/weeks/months/years. 
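
As a rough illustration of the percentage parameters listed above, here is a small Python sketch. It assumes that each percentage is the share of check cycles in that state during the period and that maintenance cycles count toward the total; both are assumptions, and `cycle_percentages` is a hypothetical name.

```python
def cycle_percentages(up, down, maintenance):
    """Return the up/down/maintenance share of all cycles, in percent."""
    counts = {"up": up, "down": down, "maintenance": maintenance}
    total = sum(counts.values())
    if total == 0:
        # No cycles ran in the period, so there is nothing to report.
        return {name: 0.0 for name in counts}
    return {name: round(100.0 * count / total, 2)
            for name, count in counts.items()}


# Hypothetical cycle counts for one entry over one week.
week = cycle_percentages(up=90, down=5, maintenance=5)
print(week)
```

Whether maintenance cycles belong in the denominator is a design choice: leaving them out would make planned downtime invisible in the up%.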

Example:
<sa_stats_up_week{previous}%> gives the up% of the previous week
<sa_stats_up_week082008%> gives the up% of week 8 of 2008
<sa_stats_down_month082008%> gives the down% of month 8 of 2008



All ideas/comments/additions are MORE THAN WELCOME



Dirk Bulinckx.

To unsubscribe send a message with UNSUBSCRIBE in the subject line to 
salive@woodstone.nu
If you use auto-responders (like out-of-the-office messages), make sure that 
they are not sent to the list nor to individual members.  Doing so will cause 
you to be automatically removed from the list. 

 

_____________________________________________________________ 

IMPORTANT NOTICE ! 

This E-Mail transmission and any accompanying attachments may contain 
confidential information intended only for the use of the individual or entity 
named above. Any dissemination, distribution, copying or action taken in 
reliance on the contents of this E-Mail by anyone other than the intended 
recipient is strictly prohibited and is not intended to, in anyway, waive 
privilege or confidentiality. If you have received this E-Mail in error please 
immediately delete it and notify sender at the above E-Mail address. 

Agrium uses state of the art anti-virus technology on all incoming and outgoing 
E-Mail. We encourage and promote the use of safe E-Mail management practices 
and recommend you check this, and all other E-Mail and attachments you receive 
for the presence of viruses. The sender and Agrium accept no liability for any 
damage caused by a virus or otherwise by the transmittal of this E-Mail. 

_____________________________________________________________ 


