Dirk,

There are a couple of examples of the problem:

1. Servers Alive hangs (hasn't happened in a while, thankfully).
2. The machine Servers Alive is running on loses its network connection. All of a sudden, you have hundreds of checks failing, and after a server reboot, everything is normal.

As far as the example goes, yes, we would want to see each instance of an outage. This report is used in our weekly team meeting, where we discuss problems from the past week. The report is called up during the meeting.

The report we have now includes records where the laststatuschange column shows a value within the 7 days before the report was run. We log the entries to an Oracle database and have two views set up to facilitate the reporting.

This view shows entries that have gone down and come back up in the last week:

DROP VIEW APPDEV.SA_STATS_V;

/* Formatted on 2008/08/29 09:05 (Formatter Plus v4.8.8) */
CREATE OR REPLACE FORCE VIEW appdev.sa_stats_v (HOST, downat, upat)
AS
   SELECT HOST,
          TO_DATE (previousstatuschange, 'MM/DD/YYYY hh:mi:ss AM') AS downat,
          TO_DATE (laststatuschange, 'MM/DD/YYYY hh:mi:ss AM') AS upat
     FROM sa_stats_t s, sa_codes_t s1, sa_codes_t s2
    WHERE TO_DATE (s.laststatuschange, 'MM/DD/YYYY hh:mi:ss AM') > SYSDATE - 7
      AND s.status = s1.status
      AND s.previousstatus = s2.status
      --and s1.description not in('Maintenance', 'Unavailable')
      AND s2.description NOT IN ('Maintenance', 'Unavailable')
      -- Start permanently don't want entries
      AND s.HOST NOT IN ('Good Morning',
                         'CalgServer4 Server',
                         'ServerA.agrium.com',
                         'ServerB.agrium.com',
                         'ServerC.agrium.com',
                         'ServerD.agrium.com')
      -- End permanently don't want entries
      -- Start temporary don't want entries
      AND s.HOST NOT IN ('CalgServer2 LIMSCP Status Monitor Service',
                         'CalgServer2 LIMSRP Event Monitor Service',
                         'CalgServer2 LIMSCP Event Monitor Service',
                         'CalgServer2 LIMSRP Status Monitor Service',
                         'CalgServer2 Redwater LIMS Prod Database')
      -- End temporary don't want entries
      AND s1.description = 'Up'
   UNION
   SELECT   s.HOST,
            h.laststatuschange AS downat,
            TO_DATE (NULL, 'MM/DD/YYYY hh:mi:ss AM') AS upat
       FROM sa_stats_t s,
            (SELECT   HOST,
                      MAX (TO_DATE (laststatuschange, 'MM/DD/YYYY hh:mi:ss AM')) AS laststatuschange
                 FROM sa_stats_t
                 -- where to_date(laststatuschange, 'MM/DD/YYYY hh:mi:ss AM') > sysdate -30
             GROUP BY HOST) h
      WHERE s.status = 1
        AND TO_DATE (s.laststatuschange, 'MM/DD/YYYY hh:mi:ss AM') = h.laststatuschange
        -- Start permanently don't want entries
        AND s.HOST NOT IN ('Good Morning',
                           'CalgServer4 Server',
                           'ServerA.agrium.com',
                           'ServerB.agrium.com',
                           'ServerC.agrium.com',
                           'ServerD.agrium.com')
        -- End permanently don't want entries
        -- Start temporary don't want entries
        AND s.HOST NOT IN ('CalgServer2 LIMSCP Status Monitor Service',
                           'CalgServer2 LIMSRP Event Monitor Service',
                           'CalgServer2 LIMSCP Event Monitor Service',
                           'CalgServer2 LIMSRP Status Monitor Service',
                           'CalgServer2 Redwater LIMS Prod Database')
        -- End temporary don't want entries
        AND s.HOST = h.HOST
   GROUP BY s.HOST, h.laststatuschange;

Here is the other view, which shows entries currently down:

DROP VIEW APPDEV.SA_STATS_CURRENTLY_DOWN_V;

/* Formatted on 2008/08/29 09:09 (Formatter Plus v4.8.8) */
CREATE OR REPLACE FORCE VIEW appdev.sa_stats_currently_down_v (HOST, downat)
AS
   SELECT   s.HOST,
            h.laststatuschange AS downat
       FROM sa_stats_t s,
            (SELECT   HOST,
                      MAX (TO_DATE (laststatuschange, 'MM/DD/YYYY hh:mi:ss AM')) AS laststatuschange
                 FROM sa_stats_t
                 -- where to_date(laststatuschange, 'MM/DD/YYYY hh:mi:ss AM') > sysdate -30
             GROUP BY HOST) h
      WHERE s.status = 1
        AND TO_DATE (s.laststatuschange, 'MM/DD/YYYY hh:mi:ss AM') = h.laststatuschange
        -- Start permanently don't want entries
        AND s.HOST NOT IN ('Good Morning',
                           'CalgServer4 Server',
                           'ServerA.agrium.com',
                           'ServerB.agrium.com',
                           'ServerC.agrium.com',
                           'ServerD.agrium.com')
        -- End permanently don't want entries
        -- Start temporary don't want entries
        AND s.HOST NOT IN ('CalgServer2 LIMSCP Status Monitor Service',
                           'CalgServer2 LIMSRP Event Monitor Service',
                           'CalgServer2 LIMSCP Event Monitor Service',
                           'CalgServer2 LIMSRP Status Monitor Service',
                           'CalgServer2 Redwater LIMS Prod Database')
        -- End temporary don't want entries
        AND s.HOST = h.HOST
   GROUP BY s.HOST, h.laststatuschange
   ORDER BY h.laststatuschange;
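
For readers following the thread: rows shaped like the output of the views above (HOST, downat, upat) could be turned into the weekly report roughly as follows. This is only a sketch under stated assumptions, not how Servers Alive or the existing report works. The names Outage, format_when, format_duration and report_rows are hypothetical, the rows are assumed to be already fetched with downat/upat as Python datetimes (upat is None for a check that is still down), and the clip_to_window switch simply illustrates the two possible treatments of an outage that began before the reporting window.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple, Optional, Tuple

_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


class Outage(NamedTuple):
    """One row as a view like sa_stats_v might return it (assumed shape)."""
    host: str
    down_at: datetime
    up_at: Optional[datetime]  # None means the check is still down


def format_when(dt: datetime) -> str:
    """Format a timestamp like '27-Aug 4:06 PM' (locale-independent)."""
    hour12 = dt.hour % 12 or 12
    suffix = "AM" if dt.hour < 12 else "PM"
    return f"{dt.day}-{_MONTHS[dt.month - 1]} {hour12}:{dt.minute:02d} {suffix}"


def format_duration(delta: timedelta) -> str:
    """Format a duration as 'H Hours M Minutes', or 'M Minutes' under an hour."""
    hours, minutes = divmod(int(delta.total_seconds() // 60), 60)
    if hours:
        return f"{hours} Hours {minutes} Minutes"
    return f"{minutes} Minutes"


def report_rows(outages: List[Outage], window_start: datetime, now: datetime,
                clip_to_window: bool = False) -> List[Tuple[str, str, str, str]]:
    """Build (check name, down at, up at, outage duration) rows, oldest first."""
    rows = []
    for outage in sorted(outages, key=lambda o: o.down_at):
        # A check that is still down is measured up to the report time.
        end = outage.up_at if outage.up_at is not None else now
        if end < window_start:
            continue  # the outage was over before the reporting window began
        start = outage.down_at
        if clip_to_window and start < window_start:
            start = window_start  # count only the part inside the window
        up_text = format_when(outage.up_at) if outage.up_at else "Currently Down"
        rows.append((outage.host, format_when(start), up_text,
                     format_duration(end - start)))
    return rows
```

With clip_to_window left off, an outage that started at 11:58 PM on the 23rd is reported from the 23rd; with it on, the same outage is reported from midnight on the 24th and its duration shrinks accordingly.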
I hope this helps,

Brett Hanson

>>> "Dirk" <[EMAIL PROTECTED]> 8/29/2008 8:50 AM >>>

If SA isn't running, then it can't detect that change; there is not much we can do about that.

Would this also mean that you would want (example):

Check Name     Down At           Up At             Outage Duration
LDAP Server    27-Aug 4:06 PM    27-Aug 4:32 PM    26 Minutes
LDAP Server    27-Aug 7:06 PM    27-Aug 7:30 PM    24 Minutes
LDAP Server    28-Aug 4:06 PM    28-Aug 7:30 PM    204 Minutes

if the entry went down several times in the last week?

Also, for the week starting on Aug 24th (at midnight), how would you handle the fact that the entry went down on the 23rd at 11:58 PM? Should it be marked as down from the 23rd at 11:58 PM, or from the 24th at midnight?

Dirk Bulinckx.

From: Servers Alive Discussion List [mailto:[EMAIL PROTECTED] On Behalf Of Brett Hanson
Sent: Friday, August 29, 2008 4:41 PM
To: Servers Alive Discussion List
Subject: Re: [SA-list] Servers Alive and reporting

One report we've been struggling with for the last couple of years is an activity report. We'd like to see a report that shows outages detected by Servers Alive in the past 7 days. This report would show the check name, when the check went down, when it came back up and the duration. Any checks down when the report is run would show as 'Currently Down'. For example:

Check Name     Down At           Up At             Outage Duration
LDAP Server    27-Aug 4:06 PM    27-Aug 4:32 PM    26 Minutes
Service A      29-Aug 3:10 AM    Currently Down    5 Hours 17 Minutes

Our biggest issue is the inaccuracy that results when Servers Alive is restarted - a check that was down before Servers Alive was shut down and was up when Servers Alive started again is not detected as a status change, and no database record showing the transition exists.

Regards,

Brett Hanson
Systems Analyst
Agrium

>>> "Dirk" <[EMAIL PROTECTED]> 8/29/2008 3:50 AM >>>

One of the frequently recurring questions about Servers Alive is whether it can do reporting. We always point to the HTML template-based output and to the DB logging (using a 3rd-party report writer).
Still, it seems that this is not what people are looking for. That's why we would like you to help us with some brainstorming around that reporting feature. I'll start by giving my own idea on it.

* it's based on the HTML template-based output
* it can be set to be executed (generated) once a day, and you can select which "entries" go on it
* you can of course have several outputs, for several sets of entries
* additional parameters are needed, like
    % up cycles
    % down cycles
    % maintenance cycles
  and this per DAY, WEEK, MONTH, YEAR, with "easy" access to the current (day/week/...) and the previous (day/week/...) and also access to other days/weeks/months/years.

Example:
<sa_stats_up_week{previous}%> gives the up% of the previous week
<sa_stats_up_week082008%> gives the up% of week 8 of 2008
<sa_stats_down_month082008%> gives the down% of month 8 of 2008

All ideas/comments/additions are MORE THAN WELCOME.

Dirk Bulinckx.

To unsubscribe send a message with UNSUBSCRIBE in the subject line to salive@woodstone.nu
If you use auto-responders (like out-of-the-office messages), make sure that they are not sent to the list nor to individual members. Doing so will cause you to be automatically removed from the list.

_____________________________________________________________
IMPORTANT NOTICE !
This E-Mail transmission and any accompanying attachments may contain confidential information intended only for the use of the individual or entity named above. Any dissemination, distribution, copying or action taken in reliance on the contents of this E-Mail by anyone other than the intended recipient is strictly prohibited and is not intended to, in anyway, waive privilege or confidentiality. If you have received this E-Mail in error please immediately delete it and notify sender at the above E-Mail address. Agrium uses state of the art anti-virus technology on all incoming and outgoing E-Mail.
We encourage and promote the use of safe E-Mail management practices and recommend you check this, and all other E-Mail and attachments you receive for the presence of viruses. The sender and Agrium accept no liability for any damage caused by a virus or otherwise by the transmittal of this E-Mail.
_____________________________________________________________
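
As a follow-up to the per-period percentages Dirk proposes earlier in the thread (% up / % down / % maintenance cycles per week), here is a rough sketch of how such figures might be computed. It rests on assumptions that the thread does not confirm: that one (timestamp, status) record is available per check cycle, that the status labels are plain strings such as "up", "down" and "maintenance", and that "week 8 of 2008" means the ISO week number. The function name weekly_status_percentages is hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple


def weekly_status_percentages(
    cycles: Iterable[Tuple[datetime, str]],
) -> Dict[Tuple[int, int], Dict[str, float]]:
    """Return {(iso_year, iso_week): {status: percent of that week's cycles}}.

    `cycles` is an assumed per-cycle log of (checked_at, status) pairs.
    """
    counts: Dict[Tuple[int, int], Counter] = defaultdict(Counter)
    for checked_at, status in cycles:
        iso = checked_at.isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts[(iso[0], iso[1])][status] += 1

    result: Dict[Tuple[int, int], Dict[str, float]] = {}
    for week, counter in counts.items():
        total = sum(counter.values())
        result[week] = {status: 100.0 * n / total for status, n in counter.items()}
    return result
```

Counting cycles rather than elapsed time matches the "% up cycles" wording in the proposal; a time-weighted percentage would need the cycle interval as well.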