OK, I tried something slightly different. I modified the existing udp.monitor (or was it the tcp.monitor?) that comes with mon so that it sends a "sniffed" SIP registration packet to the asterisk server. If it doesn't receive an answer within a set time, the monitor reports an error.
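
For anyone who wants to try the same idea, here is a minimal sketch of such a probe. It is not the modified udp.monitor described above: instead of replaying a sniffed REGISTER packet it sends a hand-built SIP OPTIONS request, and it assumes a netcat (nc) that accepts -u and -w, which varies between netcat variants. The function names, the monitor.invalid placeholder in the Via header, and the reliance on the server honouring rport are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch of a mon-style SIP probe (assumption: nc supports -u and -w).
# mon passes the hosts to check as arguments; a monitor prints the failed
# hosts and exits non-zero when something is wrong.

# Print a minimal SIP OPTIONS request for host $1, port $2 (default 5060).
build_sip_options() {
    host=$1
    port=${2:-5060}
    id=$$   # process ID used as a cheap per-run token
    printf 'OPTIONS sip:%s:%s SIP/2.0\r\n' "$host" "$port"
    # ";rport" asks the server to reply to the packet's source port.
    printf 'Via: SIP/2.0/UDP monitor.invalid;branch=z9hG4bK%s;rport\r\n' "$id"
    printf 'Max-Forwards: 70\r\n'
    printf 'From: <sip:monitor@monitor.invalid>;tag=%s\r\n' "$id"
    printf 'To: <sip:%s:%s>\r\n' "$host" "$port"
    printf 'Call-ID: %s@monitor.invalid\r\n' "$id"
    printf 'CSeq: 1 OPTIONS\r\n'
    printf 'Content-Length: 0\r\n\r\n'
}

# Succeed if $1 looks like a SIP status line, whatever the status code.
is_sip_reply() {
    case $1 in
        "SIP/2.0 "[1-6][0-9][0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Send the request over UDP and wait up to $3 seconds (default 3) for a reply.
probe_sip() {
    host=$1
    port=${2:-5060}
    timeout=${3:-3}
    reply=$(build_sip_options "$host" "$port" \
        | nc -u -w "$timeout" "$host" "$port" 2>/dev/null | head -n 1)
    is_sip_reply "$reply"
}

# Probe every host given; print the ones that failed and return 1 if any did.
monitor_main() {
    failed=""
    for h in "$@"; do
        probe_sip "$h" || failed="$failed $h"
    done
    if [ -n "$failed" ]; then
        echo $failed   # unquoted on purpose: trims the leading space
        return 1
    fi
    return 0
}

# Only run the probe when hosts were passed on the command line.
if [ "$#" -gt 0 ]; then
    monitor_main "$@"
    exit $?
fi
```

Any SIP status line counts as "alive" here, including error codes, so like the REGISTER trick this only shows that the SIP stack is answering, not that calls can be completed.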

It tells you if the server is at least answering SIP. Mind you, I once had a server freeze while the monitoring kept getting an answer. So it's not 100% foolproof, but it has saved my *** in the past :-)

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Adam Moffett
Sent: October 25, 2005 9:51 AM
To: Asterisk Users Mailing List - Non-Commercial Discussion
Subject: Re: [Asterisk-Users] Asterisk Redundency

Benjamin Lawetz wrote:
>> Since I can't do that, what I've settled on is heartbeat + mon.
>> Heartbeat will monitor for a system level failure and switch to the
>> backup machine if necessary; and mon will watch the asterisk (or any
>> other) service and restart it and/or alert me if it fails.
>
> What kind of monitor are you using to monitor asterisk?

Sorry for my slow response.

My asterisk monitor right now is embarrassingly simple. All it does is execute "show uptime" and look for output starting with "System"; see below. Obviously the method has limitations:

1) It will really only tell me that the daemon is running, not that it's able to carry any calls.
2) It only works on localhost.

Input on how to test a remote instance of asterisk would be welcome, as well as a method of making a test call or reliably testing for the ability to make calls. My impression is that this would require asterisk to have a "Dial" command in the CLI, or a Linux SIP client that I could execute from the shell. I'm not aware of the existence of either. Any other simple and reliable methods of testing asterisk's condition would be welcome.

The alerts, by the way, are pretty simple as well. See the excerpt from mon.cf below. restartasterisk.alert does exactly what it says. stopeverything.alert shuts down heartbeat, which will cause another node in the cluster to take over. In fact, that node will start mon, which will then use restartasterisk.alert to start up asterisk.

Asterisk only starts on the backup machine when the primary fails, so that config changes replicated from the primary will take effect. Total downtime should be < 3 min, which will let me hit five nines if it only happens once a year ;)

Config changes are replicated via rsync and ssh every few minutes. Voicemails are also copied from primary to backup by rsync. One thing I still need to do is make rsync stop attempting to replicate files when the failover occurs. That will probably just require another alert below the "stopeverything.alert".

The replication of course means that this setup will not protect me from a bad config change that breaks asterisk, as that change will be replicated throughout the cluster. So all significant config changes should be tested on a standalone box.

[EMAIL PROTECTED] mon]# cat /usr/lib/mon/mon.d/asterisk.monitor
#!/bin/sh
## Can only check localhost. Always checks localhost regardless of input.
SHOW_UPTIME=`/usr/sbin/asterisk -rx "show uptime" | /bin/cut -b 1-6`
## Quote the variable so an empty result (asterisk not running) still compares cleanly.
if [ "$SHOW_UPTIME" = "System" ]; then
    exit 0
else
    echo "localhost"
    exit 1
fi

From mon.cf:

watch asterisk
    service asterisk
        description asterisk pbx on localhost
        interval 10s
        monitor asterisk.monitor
        period wd {Sun-Sat}
            alert mail.alert [EMAIL PROTECTED]
            alert restartasterisk.alert [EMAIL PROTECTED]
            alertevery 30s
    service asterisk-failover
        description checking if we need to stop heartbeat
        interval 10s
        monitor asterisk.monitor
        period wd {Sun-Sat}
            alert stopeverything.alert [EMAIL PROTECTED]
            alertafter 5 3m

_______________________________________________
--Bandwidth and Colocation sponsored by Easynews.com --

Asterisk-Users mailing list
Asterisk-Users@lists.digium.com
http://lists.digium.com/mailman/listinfo/asterisk-users
To UNSUBSCRIBE or update options visit:
   http://lists.digium.com/mailman/listinfo/asterisk-users