On our second instance, I have an external errorlevel check that runs a VBScript to determine the file creation time of the web page.
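A check of this kind can be sketched roughly as follows. The original is a VBScript; this is a Python stand-in for the same idea, not the script actually in use. The function names, the argument order, and the use of the last-modified time rather than the creation time are all assumptions made for illustration, as is the convention that a non-zero exit code (errorlevel) means "down".

```python
import os
import time


def page_is_stale(path, max_age_seconds, now=None):
    """Return True if the file is missing or older than max_age_seconds.

    Hypothetical helper: uses the last-modified time (an assumption);
    swap in os.path.getctime if the page is recreated on every cycle.
    """
    if now is None:
        now = time.time()
    try:
        modified = os.path.getmtime(path)
    except OSError:
        # A missing or unreadable page is treated as a failure.
        return True
    return (now - modified) > max_age_seconds


def main(args):
    """Return an exit code: 0 = page is fresh, 1 = page is stale, 2 = bad usage."""
    if len(args) != 2:
        print("usage: check_page <path-to-html> <max-age-seconds>")
        return 2
    return 1 if page_is_stale(args[0], float(args[1])) else 0


# To run this as an external check, end the file with:
#     import sys; sys.exit(main(sys.argv[1:]))
```

Since the checks in the message below run every 10 minutes outside business hours, a threshold somewhat above 600 seconds would avoid false alarms; the exact margin is a judgment call.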
Thanks
Phillip Carter
Ph: +61 3 9235 1691

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of David Webster
Sent: Thursday, 6 January 2005 7:43 AM
To: salive@woodstone.nu
Subject: [SA-list] Who watches the watchers, and how?

Hello,

I am running SA v4.1.1609 on a server running Windows Server 2003. I have the software running as a service, performing three dozen checks or so, a mix of pings, Windows services, etc. Each check outputs to the default web page, which is then rendered using a template. Checks are done every three minutes M-F 9-5, and every 10 minutes otherwise.

All checks and alerts are working well. However, every few weeks the service crashes and I get an "Event ID 7034 The Servers Alive service terminated unexpectedly. It has done this 1 time(s)" message in the system event log. There is no other information available. There are no other unusual events recorded in the OS logs at or near the time of the SA service failure.

My problems are:

1) When this happens the service cannot be reliably restarted. It can be "started" via the Services MMC snap-in, but the checks do not run and the web page does not get updated. Further attempts to manage the service via the MMC are met with a "service did not respond in a timely blah blah blah." If I terminate the restarted service using Task Manager and then run the SA application software (instead of using the service), the check cycle will run once and update the web page, but after that, checks are not run and the web page is not updated. Also, attempts to exit the application are ignored, and Task Manager shows it as Not Responding. Only a server reboot will fix the problem. Can anyone suggest why this might be happening, places to look for clues as to the cause, or ways to recover the SA service short of a reboot?

2) I do not have a reliable way of being alerted when the SA service has crashed on this server. I use another instance of SA on another server to check whether the SA service is running on the first server, but it fails to alert me when the SA service crashes under these circumstances. (Yes Dirk, I have the check and alert configured properly.) The way I discover the failure is that I visit the web page and the last updated time is way old. Has anyone ever tried to compare the last updated time on the HTML page to the current server time, and then alert if the difference is greater than the check cycle frequency? Are there other methods of monitoring the health of SA?

Thanks for reading this longish post and for any suggestions (related to SA administration and troubleshooting) readers can provide.

David

-------------------------
[This E-mail scanned for viruses by Declude Virus]
To unsubscribe from a list, send a mail message to [EMAIL PROTECTED]
With the following in the body of the message:
unsubscribe SAlive