Re: [Nagios-users] High Availability

Robert Holman Mon, 13 May 2013 11:45:54 -0700

I would look into MK_Livestatus as a backend. This would allow you to "cluster" 
web frontends, and by simply replicating config files across nodes, you could 
have "cold" standby servers in case the backend(s) actually fail.
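
A minimal sketch of how a web frontend might query several Livestatus
backends, assuming each backend exposes its Livestatus socket over TCP (for
example through xinetd). The backend addresses and the helper names
build_lql and query_first_available are hypothetical; only the LQL request
format (GET / Columns / Filter / OutputFormat, ended by a blank line) comes
from Livestatus itself.

```python
import json
import socket


def build_lql(table, columns, filters=()):
    """Build a Livestatus Query Language (LQL) request as text."""
    lines = [f"GET {table}", "Columns: " + " ".join(columns)]
    lines += [f"Filter: {f}" for f in filters]
    lines.append("OutputFormat: json")
    # A blank line terminates the request.
    return "\n".join(lines) + "\n\n"


def query_first_available(backends, query, timeout=5.0):
    """Try each (host, port) backend in order; return rows from the first that answers."""
    last_error = None
    for host, port in backends:
        try:
            with socket.create_connection((host, port), timeout=timeout) as conn:
                conn.sendall(query.encode("utf-8"))
                conn.shutdown(socket.SHUT_WR)  # signal end of request
                chunks = []
                while True:
                    data = conn.recv(4096)
                    if not data:
                        break
                    chunks.append(data)
            return json.loads(b"".join(chunks).decode("utf-8"))
        except OSError as exc:
            last_error = exc  # remember the failure and try the next backend
    raise ConnectionError(f"no Livestatus backend reachable: {last_error}")
```

For example, build_lql("services", ["host_name", "state"], ["state = 2"])
produces a request for all services in CRITICAL state, which could then be
passed to query_first_available with a list of (host, port) pairs.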


Regards,
Rob


-----Original Message-----
From: nagios-users-requ...@lists.sourceforge.net 
[mailto:nagios-users-requ...@lists.sourceforge.net]
Sent: Saturday, May 11, 2013 5:31 AM
To: nagios-users@lists.sourceforge.net
Subject: Nagios-users Digest, Vol 84, Issue 3

Today's Topics:

   1. Re: Variables for determining time before first alert
      (Justin T Pryzby)
   2. High Availabilty with Nagios (Steve Shipway)
   3. Re: High Availabilty with Nagios
      (Supporto Tecnico - Crazy Network)
   4. Re: High Availabilty with Nagios (William Leibzon)
   5. Re: High Availabilty with Nagios (Edward St Pierre)
   6. Re: check_http with spaces problem (Claudio Kuenzler)
   7. Re: check_http with spaces problem (???????? ?????????)
   8. Re: check_http with spaces problem (Claudio Kuenzler)
   9. Re: High Availabilty with Nagios (Andrew Widdersheim)
  10. Re: High Availabilty with Nagios (frank)
  11. Re: High Availabilty with Nagios (Jim Winkle)
  12. Re: High Availabilty with Nagios (Andreas Ericsson)
  13. Re: High Availabilty with Nagios (Andreas Ericsson)
  14. Trying to figure out the PCRE expression for Nagiosgraph Map
      (Percy Kwong)
  15. Re: Trying to figure out the PCRE expression for Nagiosgraph
      Map (Claudio Kuenzler)
  16. Re: servicegroup overview not restricted for htaccess users
      (Jonas Meurer)
  17. Re: Trying to figure out the PCRE expression for Nagiosgraph
      Map (Percy Kwong)


----------------------------------------------------------------------

Message: 1
Date: Tue, 7 May 2013 22:14:17 -0700
From: Justin T Pryzby <just...@norchemlab.com>
Subject: Re: [Nagios-users] Variables for determining time before
        first alert
To: nagios-users@lists.sourceforge.net
Message-ID: <20130508051417.ga28...@norchemlab.com>
Content-Type: text/plain; charset=us-ascii

On Wed, May 08, 2013 at 12:33:19AM -0400, Alex wrote:
> > http://nagios.sourceforge.net/docs/3_0/objectdefinitions.html
>
> Thanks for your help. I've actually read quite a bit of that, and I'm
> still confused. It wasn't clear that max_check_attempts is the number
> of attempts that are made for each iteration, before another alert is

http://nagios.sourceforge.net/docs/3_0/notifications.html

max_check_attempts is the number of FAILED attempts (each made "retry_interval" 
after the previous failing attempt) before a service moves from a "soft" 
failure state to a "hard" failure state.  Notifications are sent once 
max_check_attempts have been made and the service is in a "hard" state.  
Notifications are also sent when a hard-failing service is rechecked (at 
"check_interval") and at least notification_interval has passed since the last 
notification.
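
The timing described above can be sketched as a small simulation. This is
only an illustration of the soft/hard logic as explained in this message,
not Nagios's actual scheduler; the function name and the simplification
that every check runs exactly on schedule are assumptions.

```python
def notification_times(results, max_check_attempts, check_interval,
                       retry_interval, notification_interval):
    """Return the times (in minutes) at which notifications would be sent.

    results is a sequence of booleans, one per check, True meaning OK.
    """
    now = 0
    failures = 0
    hard = False
    last_notify = None
    sent = []
    for ok in results:
        if ok:
            # Any OK result resets the failure counter and the state.
            failures, hard, last_notify = 0, False, None
            now += check_interval
            continue
        if not hard:
            failures += 1
            if failures >= max_check_attempts:
                # Soft -> hard transition: first notification goes out.
                hard = True
                sent.append(now)
                last_notify = now
                now += check_interval
            else:
                # Still soft: retry sooner than the normal interval.
                now += retry_interval
        else:
            # Already hard: re-notify only after notification_interval.
            if now - last_notify >= notification_interval:
                sent.append(now)
                last_notify = now
            now += check_interval
    return sent
```

With max_check_attempts=3, check_interval=5, retry_interval=1 and
notification_interval=10, five consecutive failures give notifications at
minutes 2 and 12.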

Justin



------------------------------

Message: 2
Date: Thu, 9 May 2013 09:19:17 +0000
From: Steve Shipway <s.ship...@auckland.ac.nz>
Subject: [Nagios-users] High Availabilty with Nagios
To: "nagios-users@lists.sourceforge.net"
        <nagios-users@lists.sourceforge.net>
Message-ID:
        <7294716191a1e142b80615ed2c633bca6830f...@uxcn10-tdc02.uoa.auckland.ac.nz>
Content-Type: text/plain; charset="iso-8859-1"

Does anyone have an HA setup for Nagios that works?

I'm thinking of creating a NEB module that will link two Nagios setups and 
replicate all status changes, config changes, downtime, comments, etc., 
and then set the 'standby' Nagios to have checks/notifications disabled when in 
standby mode, and enabled when in active mode.  Then put the two behind a 
failover load balancer (F5, Foundry or Apache reverse proxy).

However this would be too much work if someone else has already found an 
equivalent solution.

I've looked at Merlin but it doesn't seem to do what I'm after (and the 
documentation is practically nonexistent - much the same as the NEB API 
documentation, in fact).  Mod_gearman lets me have redundant checks and 
replicate *active* checks, but not commands, downtime or passive checks.

Does anyone out there have a workable way to get an active/standby or 
active/active Nagios setup?  Would be interested in hearing all ideas...

Steve


Steve Shipway
University of Auckland ITS
UNIX Systems Design Lead
s.ship...@auckland.ac.nz
Ph: +64 9 373 7599 ext 86487


------------------------------

Message: 3
Date: Thu, 09 May 2013 11:50:02 +0200
From: Supporto Tecnico - Crazy Network <supp...@crazynetwork.it>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: nagios-users@lists.sourceforge.net
Message-ID: <518b714a.1040...@crazynetwork.it>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

I would be interested too. I'm currently using merlind for this, but I would 
like to avoid, for example, double notifications if a server goes down. I do 
want both Nagios instances set to notify, since if one is down (for any 
reason) the other one should be able to check and notify, and vice versa.

Regards




--
Andrea Iannucci
----------------------------

----------------------------
Crazy Network di Iannucci Andrea
Viale G.B. Lulli, 24
00050 Cerveteri - RM
(w) www.crazynetwork.it
(e) andrea.iannu...@crazynetwork.it
(t) +39 06 62279876
(f) +39 06 62298767
(m) +39 338 8552885






------------------------------

Message: 4
Date: Thu, 9 May 2013 02:51:57 -0700
From: William Leibzon <will...@leibzon.org>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: Nagios Users List <nagios-users@lists.sourceforge.net>
Message-ID:
        <CAFCy1BiXoic=Jcq+kh-jr_yBCWVEk2EPi6ZhZUTO00=7jbf...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

On Thu, May 9, 2013 at 2:19 AM, Steve Shipway <s.ship...@auckland.ac.nz> wrote:
> Does anyone have an HA setup for Nagios that works?
>
> I'm thinking of creating a NEB module that will link two Nagios
> setups, and replicate over all status changes, config changes,
> downtime, comments, etc etc and then set the 'standby' Nagios to be
> checks/notifications disabled when in standby mode, and enabled when
> in active mode.  Then put the two behind a failover load balancer (F5, 
> Foundry or apache reverse proxy).

I've thought several times of doing it but never actually got started, although 
I have it all planned out, much as you do.

In the meantime, my HA setup, which I've done for several customers, involves 
config synced using git or svn (a script run by cron checks whether there is 
something new and then restarts Nagios if the config passes tests). Both 
servers run checks, but the config is such that one server has all 
notifications disabled except for cross-checking of the other Nagios. This is 
achieved by having a common template from which all services are derived; this 
template lives in a file specific to each server, so one has notifications 
disabled and the other enabled.
This is not full HA, in that if one server dies you have to execute a script 
that enables notifications on the other server (this can be done automatically 
too, but I prefer people to do it).
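
A rough sketch of the cron-driven sync step described above, assuming the
config lives in a git checkout. The function name, the paths and the
"service nagios restart" command are placeholders for illustration and will
differ per installation; "nagios -v" is the standard config verification
run.

```python
import subprocess


def sync_and_reload(repo_dir, nagios_cfg, run=subprocess.run):
    """Pull new config and restart Nagios only if it changed and verifies.

    Returns "unchanged", "invalid" or "restarted". The run argument exists
    so the external commands can be replaced in tests.
    """
    def head():
        result = run(["git", "-C", repo_dir, "rev-parse", "HEAD"],
                     capture_output=True, text=True, check=True)
        return result.stdout.strip()

    before = head()
    run(["git", "-C", repo_dir, "pull", "--ff-only"], check=True)
    if head() == before:
        return "unchanged"  # nothing new, leave the running daemon alone

    # Verify the configuration before touching the running daemon.
    if run(["nagios", "-v", nagios_cfg]).returncode != 0:
        return "invalid"

    # Assumption: a SysV-style service wrapper; adjust for the local system.
    run(["service", "nagios", "restart"], check=True)
    return "restarted"
```

Run from cron on both servers, this keeps the shared config in step while
the per-server template file (with notifications enabled or disabled) stays
outside the synced tree or is excluded from it.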




------------------------------

Message: 5
Date: Thu, 9 May 2013 10:59:30 +0100
From: Edward St Pierre <edward.stpie...@gmail.com>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: Nagios Users List <nagios-users@lists.sourceforge.net>
Message-ID:
        <CAHryeXGwcWNougRAnS4A+Z27K5ephjRj3TaLfXtiR3Uaj==v...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi,

I have done this before using DRBD for block-based replication and clustering 
on Red Hat; this could also be done with Pacemaker/Corosync clusters.

Ed



------------------------------

Message: 6
Date: Thu, 9 May 2013 13:23:24 +0200
From: Claudio Kuenzler <c...@claudiokuenzler.com>
Subject: Re: [Nagios-users] check_http with spaces problem
To: Nagios Users List <nagios-users@lists.sourceforge.net>
Message-ID:
        <CAF-yqgj3Y=56mag2fh4lci4ft-j4wwczn4esplmkptypoq5...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

> Sun May 5 22:29:03 EEST 2013 /usr/lib64/nagios/plugins/check_http -H
> granma.gr -u http://granma.gr/index.html -R "Web " -w 10 -c 20 Name or
> service not known HTTP CRITICAL - Unable to open TCP socket
>

You have to break up the -u argument. -u expects the path, not the complete 
URI. So in this case:

/usr/lib64/nagios/plugins/check_http -H granma.gr -u /index.html -R "Web" -w 10 -c 20
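
As an illustration of splitting a full URL into the host (-H) and path (-u)
parts that check_http expects, here is a small sketch. The helper name
check_http_args is made up for this example; the flags used (-H, -u, -p,
-S, -R, -w, -c) are standard check_http options.

```python
from urllib.parse import urlsplit


def check_http_args(url, pattern=None, warn=None, crit=None):
    """Turn a full URL into a check_http argument list."""
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError(f"no host in URL: {url!r}")
    # -u takes only the path (plus query string), never the full URI.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    args = ["check_http", "-H", parts.hostname, "-u", path]
    if parts.port:
        args += ["-p", str(parts.port)]
    if parts.scheme == "https":
        args.append("-S")
    if pattern is not None:
        args += ["-R", pattern]
    if warn is not None:
        args += ["-w", str(warn)]
    if crit is not None:
        args += ["-c", str(crit)]
    return args
```

Passing the result as a list (for example to subprocess.run) also keeps a
pattern containing spaces as a single argument, without shell quoting.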

------------------------------

Message: 7
Date: Thu, 9 May 2013 14:29:54 +0300
From: ???????? ????????? <dkokma...@gmail.com>
Subject: Re: [Nagios-users] check_http with spaces problem
To: Nagios Users List <nagios-users@lists.sourceforge.net>
Message-ID:
        <CAFY9zEw92mX255_sA_5i+TLmxqxn0B=qqz4-4udkwcsxvtp...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Thank you for the answer.

The problem doesn't seem to be in the URL but in the -R option.

If I use -R "Web" the response is OK, but if I use -R "Web somethin" it returns 
an error!



------------------------------

Message: 8
Date: Thu, 9 May 2013 13:45:09 +0200
From: Claudio Kuenzler <c...@claudiokuenzler.com>
Subject: Re: [Nagios-users] check_http with spaces problem
To: Nagios Users List <nagios-users@lists.sourceforge.net>
Message-ID:
        <CAF-yqggd5xXGv6QohgbLS7aG8W5BFKYQ=ep3gz4=sn9iisv...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

>
> If I use -R "Web" the response is ok but if i use -R "Web somethin" it
> returns error!
>

Because the pattern needs to exist in the page's source.

./check_http -H granma.gr -u /index.html -R "Web somethin"
HTTP CRITICAL: HTTP/1.1 200 OK - pattern not found - 4342 bytes in 0.126 second 
response time |time=0.125853s;;;0.000000 size=4342B;;;0

./check_http -H granma.gr -u /index.html -R "Web Design"
HTTP OK: HTTP/1.1 200 OK - 4342 bytes in 0.125 second response time
|time=0.124846s;;;0.000000 size=4342B;;;0
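
The behaviour shown above can be approximated in a few lines: -R does a
case-insensitive regex search over the returned page, so the check only
passes when the pattern occurs in the content. This sketch uses Python's re
module as a stand-in for the plugin's regex engine, and the sample page
body is invented for illustration.

```python
import re


def pattern_found(body, pattern):
    """Rough stand-in for check_http -R: case-insensitive search in the body."""
    return re.search(pattern, body, re.IGNORECASE) is not None


# Hypothetical page content, not the real granma.gr page.
sample_body = "<html><title>Granma Web Design</title></html>"

print(pattern_found(sample_body, "Web"))
print(pattern_found(sample_body, "Web Design"))
print(pattern_found(sample_body, "Web somethin"))
```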





------------------------------

Message: 9
Date: Thu, 9 May 2013 10:48:54 -0400
From: Andrew Widdersheim <awiddersh...@hotmail.com>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: Nagios Users List <nagios-users@lists.sourceforge.net>
Message-ID: <snt143-w18b68e12bd581b3471ca95dd...@phx.gbl>
Content-Type: text/plain; charset="iso-8859-1"

I gave a talk at last year's conference that touches on an HA Nagios setup 
that uses DRBD and Pacemaker. There were also talks about mod_gearman and 
Merlin that might be helpful. The slides (and maybe video?) are available on 
nagios.org. Here is a link to my slides:

http://www.slideshare.net/nagiosinc/andrew-widdersheim-nagiosisdownbosswantstosee-you


------------------------------

Message: 10
Date: Thu, 9 May 2013 11:33:53 -0500 (CDT)
From: frank <ra...@they.org>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: Nagios Users List <nagios-users@lists.sourceforge.net>
Message-ID: <alpine.lrh.2.03.1305091125050.5...@they.org>
Content-Type: text/plain; charset="iso-8859-1"

While HA can be a great thing, I've always been of the opinion that a monitoring 
setup needs to have as few moving parts as possible. The more complexity in the 
monitor, the more chance you'll be chasing monitoring issues rather than site 
issues. And everything you add on top of the monitor also needs to be monitored. 
So somehow that F5 is going to need an out-of-band monitor, because if it dies 
then your Nagios host may well not have a way to contact you about it unless 
you've dual-homed it, which brings up a whole other set of issues.

The closest I got to HA at my last gig was creating a CNAME for the active 
Nagios host, so in a failover you point the CNAME to the new box and at least 
passive checks can still roll in (after the DNS timeout of course, which I say 
is better than reconfiguring every NSCA client).

-f


------------------------------

Message: 11
Date: Thu, 09 May 2013 13:33:50 -0500
From: Jim Winkle <jrwin...@wisc.edu>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: Nagios Users List <nagios-users@lists.sourceforge.net>
Message-ID: <7750eefa25b76.518ba...@wiscmail.wisc.edu>
Content-Type: text/plain; CHARSET=US-ASCII

On 05/09/13, Steve Shipway  wrote:

> Does anyone have an HA setup for Nagios that works?
>
> I'm thinking of creating a NEB module that will link two Nagios setups, and 
> replicate over all status changes, config changes, downtime, comments, etc 
> etc and then set the 'standby' Nagios to be checks/notifications disabled 
> when in standby mode, and enabled when in active mode. Then put the two 
> behind a failover load balancer (F5, Foundry or apache reverse proxy).

We use rsync (run out of cron every minute) and a floating VIP between two 
hosts. Nagios is running on only one host at a time. It's a trivial (manual) 
process to switch between hosts.

Files which are synced: all Nagios files except logs and transient results. 
Files synced include Nagios configs, binaries and CGIs, helper apps, plugins, 
local plugins and NRPE configs, docs, HTML files, status files, all files in 
~nagios, and the crontab for user nagios.
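
For illustration, the once-a-minute sync could be driven by a small wrapper
that builds the rsync command. The directory names, exclude patterns and
standby hostname below are hypothetical, not Jim's actual layout; -a,
--relative and --exclude are standard rsync options.

```python
def rsync_command(src_dirs, standby_host, excludes=()):
    """Build an rsync command that mirrors src_dirs to the standby host."""
    cmd = ["rsync", "-a", "--relative"]
    # Skip logs and transient check results, as described above.
    cmd += [f"--exclude={pattern}" for pattern in excludes]
    cmd += list(src_dirs)
    # With --relative, full source paths are recreated under "/" remotely.
    cmd.append(f"{standby_host}:/")
    return cmd


# Example values only; adjust to the local installation.
example = rsync_command(
    ["/usr/local/nagios/", "/home/nagios/"],
    "nagios-standby.example.org",
    excludes=["*.log", "var/spool/checkresults/"],
)
print(" ".join(example))
```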

-- Jim



------------------------------

Message: 12
Date: Fri, 10 May 2013 10:57:28 +0200
From: Andreas Ericsson <a...@op5.se>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: supp...@crazynetwork.it,    Nagios Users List
        <nagios-users@lists.sourceforge.net>
Message-ID: <518cb678.1080...@op5.se>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 2013-05-09 11:50, Supporto Tecnico - Crazy Network wrote:
> I would be interested too, i'm actually using merlind for this right
> now, but i would like to dont have for example double notifications if a
> server goes down.. and i do want both nagios set for notify, since if
> one is down (for any reason) the other one should be able to check and
> notify and vice-versa....
>

Double notifications are a bug, unless you send passive check results to
both masters, in which case it's by design. Usually people solve passive
checks by arranging a single target IP or hostname to send to and then
adding peered nodes at that tier as necessary, so that the monitored
machines do not all have to send check results to multiple nodes.

--
Andreas Ericsson                   andreas.erics...@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.



------------------------------

Message: 13
Date: Fri, 10 May 2013 10:58:12 +0200
From: Andreas Ericsson <a...@op5.se>
Subject: Re: [Nagios-users] High Availabilty with Nagios
To: Nagios Users List <nagios-users@lists.sourceforge.net>
Message-ID: <518cb6a4.4040...@op5.se>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 2013-05-09 11:19, Steve Shipway wrote:
> Does anyone have an HA setup for Nagios that works?
>
> I'm thinking of creating a NEB module that will link two Nagios
> setups, and replicate over all status changes, config changes,
> downtime, comments, etc etc and then set the 'standby' Nagios to be
> checks/notifications disabled when in standby mode, and enabled when
> in active mode.  Then put the two behind a failover load balancer
> (F5, Foundry or apache reverse proxy).
>
> However this would be too much work if someone else has already found
> an equivalent solution.
>
> I've looked at Merlin but it doesn't seem to do what I'm after (and
> the documentation is practically nonexistant - much the same as the
> NEB API documentation, in fact).  Mod_gearman lets me have redundant
> checks and replicate *active* checks, but not commands, downtime or
> passive checks.


Merlin would do exactly that if you set up one of the nodes as a poller
with all hosts assigned to it. When the poller goes down, the
master will by default take over checks for it.

Merlin is actually pretty well documented, but as text files that you
have to read the old-school way. If there's anything you find lacking
in the HOWTO document or the README, please let me know and I'll
amend it.

>
> Does anyone out there have a workable way to get an active/standby or
> active/active Nagios setup?  Would be interested in hearing all
> ideas...
>

Well, we have about 800 of them.

--
Andreas Ericsson                   andreas.erics...@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231




------------------------------

Message: 14
Date: Fri, 10 May 2013 15:46:38 -0400
From: Percy Kwong <p...@psk.net>
Subject: [Nagios-users] Trying to figure out the PCRE expression for
        Nagiosgraph Map
To: nagios-users@lists.sourceforge.net
Message-ID: <518d4e9e.1030...@psk.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

I'm writing a pcre rule for a nagios map file.

The output for one query would be:

PROCS OK: 11 processes with args 'apache'

What would the map rule look like that would do the following?

1. Begin with "PROCS OK:"
2. End with "args 'apache'"
3. Extract only the numeric value before the word processes?

Assuming it would be a nested regex within the regex.

So basically, the map regex would only return 11, but enforce the rules
above?

Just trying to understand the logic behind this.

Thanks.



------------------------------

Message: 15
Date: Fri, 10 May 2013 23:11:42 +0200
From: Claudio Kuenzler <c...@claudiokuenzler.com>
Subject: Re: [Nagios-users] Trying to figure out the PCRE expression
        for Nagiosgraph Map
To: p...@psk.net, Nagios Users List
        <nagios-users@lists.sourceforge.net>
Message-ID:
        <caf-yqgiw6aj_w_qurdzrk-6ptcaw+_jfzvfkbau6sdcui9j...@mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

> The output for one query would be:
>
> PROCS OK: 11 processes with args 'apache'
>

Well, first of all you'd have to make sure that nagiosgraph also takes the
output into account.
It's always better to do that with perfdata...

You can also choose to parse the output as the source, although I
strongly recommend using perfdata. That's what it is for.


>
> What would the map rule look like that would do the following?
>
> 1. Begin with "PROCS OK:"
> 2. End with "args 'apache'"
> 3. Extract only the numeric value before the word processes?


The regex would look something like this:

/output:PROCS.*:\s*(\d+) processes.*/

assuming that you don't care about the args and the status (OK, WARNING,
CRITICAL) part. The \s* allows for the space between the colon and the
number.
Only the number (11) would be captured from the output in this case.
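
A pattern like this can be tried outside nagiosgraph before it goes into
the map file. The following is only a sketch in Python, whose re module
treats these constructs the same way PCRE does. The names loose, strict
and extract are made up for illustration, and the strict pattern is an
assumption about how the three requirements from the question could be
enforced; it is not taken from the nagiosgraph documentation.

```python
import re

# Sample plugin output from the question, with the "output:" prefix
# that the map rule matches against.
line = "output:PROCS OK: 11 processes with args 'apache'"

# Loose rule: any status, any args. \s* allows the space after the colon.
loose = re.compile(r"output:PROCS.*:\s*(\d+) processes")

# Strict rule (assumption): must begin with "PROCS OK:" and end with
# "args 'apache'".
strict = re.compile(r"output:PROCS OK:\s*(\d+) processes .*args 'apache'$")


def extract(pattern, text):
    """Return the captured process count as an int, or None if no match."""
    m = pattern.search(text)
    return int(m.group(1)) if m else None


print(extract(loose, line))
print(extract(strict, line))
print(extract(strict, "output:PROCS WARNING: 11 processes with args 'apache'"))
```

Only the parenthesised group is captured; the rest of the pattern just
decides whether the rule applies at all.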

------------------------------

Message: 16
Date: Sat, 11 May 2013 13:24:27 +0200
From: Jonas Meurer <jo...@freesources.org>
Subject: Re: [Nagios-users] servicegroup overview not restricted for
        htaccess users
To: nagios-users@lists.sourceforge.net
Message-ID: <518e2a6b.3090...@freesources.org>
Content-Type: text/plain; charset=ISO-8859-1

Hello,

Am 06.05.2013 10:42, schrieb Jonas Meurer:
> I fear that I discovered a security issue in Nagios 3.4.4 status.cgi:

No comments on that?

> All htaccess users, even if not listed in any authorized_for_* config
> option, have full access to service group overview, summary and grid:
> /nagios/cgi-bin/status.cgi?servicegroup=all&style=overview
> /nagios/cgi-bin/status.cgi?servicegroup=all&style=summary
> /nagios/cgi-bin/status.cgi?servicegroup=all&style=grid
>
> I hope that this is not intended. Is this issue known?
>
> Kind regards,
>   jonas




------------------------------

Message: 17
Date: Sat, 11 May 2013 07:30:36 -0400
From: Percy Kwong <p...@psk.net>
Subject: Re: [Nagios-users] Trying to figure out the PCRE expression
        for Nagiosgraph Map
To: Claudio Kuenzler <c...@claudiokuenzler.com>
Cc: Nagios Users List <nagios-users@lists.sourceforge.net>
Message-ID: <518e2bdc.2070...@psk.net>
Content-Type: text/plain; charset="iso-8859-1"

OK.  So to make more sense of the whole thing, the only thing that is
taken into account is the actual numerical value?  In other words, it's
automatically parsed?  This is what I wasn't sure of.

Here is the entry in the mapfile I was using:



I guess the reason I'm having issues with this is the following snippet
from the nagiosgraph.log:

Fri May 10 12:57:51 2013 insert.pl warn output/perfdata not recognized:
hostname:mymachine
servicedesc:Apache Processes
output:PROCS OK: 11 processes with args apache
perfdata:

The problem is that there is no perfdata and the RRD file isn't being
populated (and, obviously, there is no graph).  I'm attributing this to
the fact that the map file entry is wrong.  This is really where my
problem lies.  Am I looking in the wrong place?
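
One way to narrow this down is to test candidate rules against the
record exactly as it was logged. The sketch below is in Python and
assumes that nagiosgraph matches map rules against the multi-line
hostname/servicedesc/output/perfdata record shown in the log; the two
patterns are hypothetical and are not the actual map file entry. Note
that the logged output reads "args apache" without quotes, so a rule
that insists on the quotes would not match it.

```python
import re

# The record exactly as it appears in the nagiosgraph.log excerpt.
record = (
    "hostname:mymachine\n"
    "servicedesc:Apache Processes\n"
    "output:PROCS OK: 11 processes with args apache\n"
    "perfdata:\n"
)

# Hypothetical candidate rules; neither is taken from the real map file.
quoted = re.compile(r"output:PROCS OK:\s*(\d+) processes with args 'apache'")
unquoted = re.compile(r"output:PROCS OK:\s*(\d+) processes with args '?apache'?")


def first_group(pattern, text):
    """Return the first captured group as a string, or None if no match."""
    m = pattern.search(text)
    return m.group(1) if m else None


# perfdata is empty here, so only a rule on the output line can yield a value.
perfdata = re.search(r"^perfdata:(.*)$", record, re.MULTILINE).group(1)

print(repr(perfdata))
print(first_group(quoted, record))
print(first_group(unquoted, record))
```

If a rule matches here but the RRD file is still not populated, the
map file entry itself is probably not the whole story.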

Thanks.





On 5/10/2013 5:11 PM, Claudio Kuenzler wrote:
>
>     The output for one query would be:
>
>     PROCS OK: 11 processes with args 'apache'
>
>
> Well, first of all you'd have to make sure that nagiosgraph also takes
> the output into account.
> It's always better to do that with perfdata...
>
> You can also choose to parse the output as the source,
> although I strongly recommend using perfdata. That's what it is for.
>
>
>     What would the map rule look like that would do the following?
>
>     1. Begin with "PROCS OK:"
>     2. End with "args 'apache'"
>     3. Extract only the numeric value before the word processes?
>
>
> The regex would look something like this:
>
> /output:PROCS.*:\s*(\d+) processes.*/
>
> assuming that you don't care about the args and the status (OK,
> WARNING, CRITICAL) part. The \s* allows for the space between the
> colon and the number.
> Only the number (11) would be captured from the output in this case.


------------------------------


_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users


End of Nagios-users Digest, Vol 84, Issue 3
*******************************************

_______________________________________________
Nagios-users mailing list
Nagios-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting 
any issue. 
::: Messages without supporting info will risk being sent to /dev/null
