Greetings, all.  I’m running the latest files from CVS and trying to set up a test environment where one server (named branch-1) alerts a master server (“mainmonitor”) in the event of a problem.  I can get branch-1 to recognize a problem and send an alert to the master server, and the master server logs the trap as received (in /var/log/messages), but I can’t get mainmonitor to then actually kick off an alert action.

 

Here’s my pertinent configuration for the “lower” server (from mon.cf):

 

watch OtherServer
    service DRBD_Status
        interval 15s
        monitor DRBDCheck.monitor -s you
        description Is\ DRBD\ working\ there?
        period wd {Mon-Sun}
           alert remote.alert -H mainmonitor

 

I can see the trap being sent in branch-1’s /var/log/messages:

 

Jul 12 16:17:05 branch-1 mon[2649]: failure for OtherServer DRBD_Status 1152739025 DRBD_Not_Running

Jul 12 16:17:05 branch-1 mon[2649]: calling alert remote.alert for OtherServer/DRBD_Status (/opt/mon/alert.d/remote.alert,-H mainmonitor) DRBD_Not_Running

 

On the “mainmonitor” server, my mon.cf config is:

 

watch default
     service default
     description Default trap service
     period wd {Mon-Sun}
         alert mail.alert [EMAIL PROTECTED]

 

I’ve got my auth.cf set to receive traps from anyone (* * *).

 

When mainmonitor gets one of the traps, I’ll see this in /var/log/messages:

 

Jul 11 16:20:04 monitor mon[2017]: trap received for undefined service type default/DRBD_Status

 

...but nothing will actually get kicked off and no mail is sent.  Also, the mon.cgi program (running on mainmonitor) will stay in the blue/unchecked status.

 

 

Interestingly enough, if I change my mainmonitor server’s mon.cf to be:

 

watch default
     service DRBD_Status
     description Default trap service
     period wd {Mon-Sun}
         alert mail.alert [EMAIL PROTECTED]

 

I’ll get this in the /var/log/messages:

 

Jul 11 16:24:49 monitor mon[2017]: trap trap 0 from  grp=default svc=DRBD_Status, sta=0

 

...but there is still nothing kicked off and no mail gets sent.  But...the mon.cgi program then shows the default service group to be in the green/good status.
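
To compare the two cases side by side, here is a minimal Python sketch that pulls the group, service, and status out of the trap lines mon writes to /var/log/messages. The two patterns are inferred only from the sample log lines quoted in this message, so they are an assumption about the log format, not a documented one, and the function name classify_trap_line is made up for illustration.

```python
import re

# Both patterns are derived only from the two sample lines quoted in this
# message; other mon versions may word these log entries differently.
UNDEFINED_RE = re.compile(
    r"trap received for undefined service type ([^/\s]+)/(\S+)"
)
ACCEPTED_RE = re.compile(
    r"trap trap \d+ from\s+grp=(\S+) svc=([^\s,]+), sta=(\d+)"
)


def classify_trap_line(line):
    """Return (kind, group, service, status) for a trap log line, else None.

    kind is "undefined" when mon logged the trap against a service it does
    not know, and "accepted" when it logged the grp/svc/sta form.
    status is None for the "undefined" form, which carries no sta field.
    """
    m = UNDEFINED_RE.search(line)
    if m:
        return ("undefined", m.group(1), m.group(2), None)
    m = ACCEPTED_RE.search(line)
    if m:
        return ("accepted", m.group(1), m.group(2), int(m.group(3)))
    return None


if __name__ == "__main__":
    samples = [
        "Jul 11 16:20:04 monitor mon[2017]: trap received for undefined "
        "service type default/DRBD_Status",
        "Jul 11 16:24:49 monitor mon[2017]: trap trap 0 from  "
        "grp=default svc=DRBD_Status, sta=0",
    ]
    for entry in samples:
        print(classify_trap_line(entry))
```

Running this over the mainmonitor log would at least show which form each incoming trap was logged in, and which sta value was recorded for it.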

 

Any thoughts as to what’s going on here?  I’m trying to get this working:

 

-        An alert getting kicked off by the mainmonitor’s system when it receives a trap; and

-        The mon.cgi program on mainmonitor showing an alert status once it’s received that trap.

 

Thanks,

Tim

 

_______________________________________________
mon mailing list
[email protected]
http://linux.kernel.org/mailman/listinfo/mon
