The heartbeat-monitor tip is right on, cluther... good stuff. Just a couple comments for NickH:
First, AutoMate also has the built-in ability to write to the Event Log. In my case, I have every AM job set to write to the Event Log if there's an error, so those errors show up in Zenoss quite quickly. Dunno if the job that is freezing up the service in your case is actually erroring (probably not), but it is a useful thing to monitor anyway.

Secondly, a point of encouragement: with some elbow grease you'll probably be able to fix the offending AutoMate job. I've found that while AM does a poor job of handling unexpected problems (screen prompts, network destination down, etc.) on its own, you can build more failsafes, timeouts, and success/failure tests into the jobs so they detect the problem, fail gracefully, and send notifications.
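To make the failsafe idea a bit more concrete, here is a rough sketch of the pattern in Python. It is not AutoMate code (AM jobs are built in its own task editor), and `run_step`, `StepResult`, `success_test`, and `notify` are names I made up for illustration. The shape is what matters: put a timeout on the step, test for success afterwards, and send a notification instead of hanging.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StepResult:
    status: str  # "ok", "failed", or "timeout"
    detail: str


def run_step(cmd: List[str],
             timeout_s: float,
             success_test: Optional[Callable[[], bool]] = None,
             notify: Callable[[str], None] = print) -> StepResult:
    """Run one job step with a timeout, a success test, and a notification.

    Hypothetical helper for illustration only; not part of AutoMate or Zenoss.
    """
    try:
        # subprocess.run kills the child and raises TimeoutExpired if the
        # step does not finish within timeout_s seconds.
        proc = subprocess.run(cmd, capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        result = StepResult("timeout", f"no exit after {timeout_s}s")
    else:
        if proc.returncode != 0:
            result = StepResult("failed", f"exit code {proc.returncode}")
        elif success_test is not None and not success_test():
            # The step exited cleanly but did not produce the expected outcome.
            result = StepResult("failed", "success test did not pass")
        else:
            result = StepResult("ok", "")

    if result.status != "ok":
        # Fail gracefully: report the problem rather than blocking the service.
        notify(f"step {cmd!r}: {result.status} ({result.detail})")
    return result
```

For example, `run_step([sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=0.5)` gives up after half a second and reports a timeout rather than freezing. In a real job, `notify` could be whatever writes to the Event Log or sends mail, so the failure lands in your monitoring.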
