On Mar 29, 2011, at 11:34 PM, Michael Schwartzkopff wrote:

>> On Mar 29, 2011 6:12 AM, "Michael Schwartzkopff" <mi...@clusterbau.com> wrote:
>>>> On Tue, Mar 29, 2011 at 3:29 AM, Vadym Chepkov <vchep...@gmail.com> wrote:
>>>>> On Mar 24, 2011, at 12:46 AM, Rakesh K wrote:
>>>>>> Hi all,
>>>>>> Is there any way to send email notifications when a resource fails
>>>>>> in the cluster framework?
>>>>>>
>>>>>> While I was going through the Pacemaker Explained document provided
>>>>>> on the website www.clusterlabs.org,
>>>>>>
>>>>>> there was no content in chapter 7, which covers sending email
>>>>>> notification events.
>>>>>>
>>>>>> Can anybody help me with this?
>>>>>>
>>>>>> For now I am using crm_mon --daemonize --as-html <path to file>
>>>>>> to maintain the status of HA in an HTML file.
>>>>>>
>>>>>> Is there any other approach for sending email notifications?
>>>>>
>>>>> Last time I checked, crm_mon is not well suited for this purpose.
>>>>>
>>>>> crm_mon has the following option
>>>>>
>>>>> -T, --mail-to=value
>>>>>
>>>>> Send Mail alerts to this user. See also
>>>>>
>>>>> --mail-from, --mail-host, --mail-prefix
>>>>>
>>>>> But you will end up with an obscene amount of e-mails; I was blocked
>>>>> from gmail when I tried to use it once :) For one resource failure
>>>>> you will get 4 e-mails: monitor, stop, start, monitor. Now imagine if
>>>>> it was the most significant member of a group or, worse, a node
>>>>> failure...
>>>>>
>>>>> nagios would be better suited for this purpose, but, unfortunately,
>>>>> crm_mon is broken
>>>>> (http://developerbugs.linux-foundation.org/show_bug.cgi?id=2344) for
>>>>> quite a while.
>>>>
>>>> The fix is going to have to come from the community, I don't have any
>>>> knowledge of nagios
>>>>
>>>>> I am yet to find a good monitoring solution for pacemaker, hopefully
>>>>> somebody had more success and will share.
>>>
>>> Use SNMP. It is the standard protocol for monitoring. Add an "extend"
>>> line to your snmpd.conf to call a script that returns the number of
>>> failcounts. You can easily monitor this with every NMS. For nagios use
>>> check_snmp.
>>
>> I'm afraid it won't be able to tell more than "stuff happened" :(
>> Would it?
>
> Yes. Like a good NMS always does. To analyse the error you still have to
> read the logs yourself.
>
What I meant was, I can't see how one "extend" line would be able to supply
specifics about exactly which resource has failed. Would you kindly share an
example?

I was trying to integrate crm_mon with SNMP Trap Translator (snmptt), but
haven't had any luck with it either. I posted details in another thread.

The lack of an "out-of-the-box" monitoring solution for pacemaker is a major
deficiency in my daily use, and I am sure I am not alone. Maybe it's out
there, but Chapter 7 of "Pacemaker Explained" is yet to be written.

Thanks,
Vadym

_______________________________________________
Pacemaker mailing list: Pacemaker@oss.clusterlabs.org
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker
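
For illustration only, here is a rough sketch of what a script behind such an
"extend" line might look like. It is not taken from this thread. It assumes
that "crm_mon -1 -f" prints a migration summary with "* Node <name>:" lines
followed by indented "<resource>: ... fail-count=N" lines; that output shape,
the script name and the path below are all assumptions, so check them against
your own cluster. Because the script names each failed resource on its first
output line, a single "extend" entry can report more than a bare count.

```python
#!/usr/bin/env python3
"""Print one line summarising Pacemaker fail counts, for an snmpd "extend" entry."""
import re
import subprocess

# Assumed shapes of the relevant `crm_mon -1 -f` lines (verify on your cluster):
#   * Node node1:
#      p_ip: migration-threshold=1000000 fail-count=2
NODE_RE = re.compile(r"^\*\s+Node\s+([^\s:]+)")
FAIL_RE = re.compile(r"^\s+([^\s:]+):.*\bfail-count=(\d+)")


def parse_failcounts(text):
    """Return a list of (node, resource, count) for every non-zero fail count."""
    failures = []
    node = "unknown"  # used if a fail-count line appears before any node line
    for line in text.splitlines():
        node_match = NODE_RE.match(line)
        if node_match:
            node = node_match.group(1)
            continue
        fail_match = FAIL_RE.match(line)
        if fail_match and int(fail_match.group(2)) > 0:
            failures.append((node, fail_match.group(1), int(fail_match.group(2))))
    return failures


def summarize(failures):
    """Format everything on a single line so one SNMP string value carries it."""
    if not failures:
        return "OK: no failed resources"
    details = ", ".join(f"{rsc}@{node}={count}" for node, rsc, count in failures)
    return f"FAILED {len(failures)}: {details}"


def main():
    try:
        # -1 prints the status once and exits; -f includes fail counts.
        result = subprocess.run(["crm_mon", "-1", "-f"], capture_output=True,
                                text=True, check=True, timeout=10)
    except (OSError, subprocess.SubprocessError) as exc:
        print(f"UNKNOWN: could not run crm_mon ({exc})")
        return
    print(summarize(parse_failcounts(result.stdout)))


if __name__ == "__main__":
    main()
```

With the script saved under a hypothetical path, the snmpd.conf entry would
be something like

    extend pcmk-failcounts /usr/local/bin/pcmk_failcounts.py

and the summary line should then be readable as
NET-SNMP-EXTEND-MIB::nsExtendOutput1Line."pcmk-failcounts", for example with
check_snmp. Note that the user snmpd runs as must be allowed to run crm_mon.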