[Gluster-infra] [Bug 1564372] Setup Nagios server

2020-03-12 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

Worker Ant  changed:

       What|Removed             |Added
-----------+--------------------+--------------------
     Status|ASSIGNED            |CLOSED
 Resolution|---                 |UPSTREAM
Last Closed|2018-06-20 18:25:33 |2020-03-12 13:03:25



--- Comment #15 from Worker Ant  ---
This bug has been moved to
https://github.com/gluster/project-infrastructure/issues/41 and will be
tracked there from now on. Visit the GitHub issue URL for further details.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
___
Gluster-infra mailing list
Gluster-infra@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-infra


[Gluster-infra] [Bug 1564372] Setup Nagios server

2019-02-19 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1564372



--- Comment #12 from M. Scherer  ---
So, NRPE appears to be confined by SELinux, notifications have improved (the
text messages are better than before), and I am adding servers one by one.



[Gluster-infra] [Bug 1564372] Setup Nagios server

2018-09-28 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1564372



--- Comment #11 from M. Scherer  ---
So, status (again, mostly for myself):
- check for processes stuck in Z state is done and working
- check for SELinux is done and tested
- the munin notifications should now clean themselves up
- check for a specific process is done and working, tested on squid/ubunoun

Next steps:
- verify NRPE again in detail (is it properly confined by SELinux, what can a
rogue client achieve)
- improve notifications
- add more verification on various servers
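
For readers following along, the Z-state check mentioned above could look
roughly like this. It is only a sketch, not the plugin deployed here: the
function names and thresholds are made up for illustration, and it assumes a
Linux /proc layout. The exit codes (0 OK, 1 WARNING, 2 CRITICAL) are the usual
Nagios plugin convention.

```python
import glob

OK, WARNING, CRITICAL = 0, 1, 2  # usual Nagios plugin exit codes


def state_from_stat(stat_line):
    """Return the one-letter process state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the last ')'.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def evaluate_zombies(stat_lines, warn=1, crit=5):
    """Map the number of Z-state processes to (exit_code, message).

    The warn/crit thresholds are hypothetical defaults.
    """
    zombies = sum(1 for line in stat_lines if state_from_stat(line) == "Z")
    if zombies >= crit:
        return CRITICAL, f"CRITICAL: {zombies} zombie processes"
    if zombies >= warn:
        return WARNING, f"WARNING: {zombies} zombie processes"
    return OK, f"OK: {zombies} zombie processes"


def read_stat_lines():
    """Collect stat lines, skipping processes that exit while we read."""
    lines = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as handle:
                lines.append(handle.read())
        except OSError:
            continue
    return lines


def main():
    """Print the status line and return the exit code for the plugin."""
    code, message = evaluate_zombies(read_stat_lines())
    print(message)
    return code
```

A wrapper script would end with sys.exit(main()) so that NRPE sees the exit
code; the evaluation is kept separate from the /proc reading so it can be
tested with canned lines.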



[Gluster-infra] [Bug 1564372] Setup Nagios server

2018-09-28 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1564372



--- Comment #10 from M. Scherer  ---
So, I deployed NRPE internally and am testing it on the munin server. Right
now it only checks the load and zombie processes, but I have code for SELinux
and for checking the rpm db, and I think I have an architecture for adding more.



[Gluster-infra] [Bug 1564372] Setup Nagios server

2018-09-27 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1564372



--- Comment #9 from M. Scherer  ---
Second step:
https://github.com/fedora-selinux/selinux-policy-contrib/pull/72


In the meantime, I will make munin run unconfined on the server side until I
can work on a send_nsca policy.



[Gluster-infra] [Bug 1564372] Setup Nagios server

2018-09-27 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1564372



--- Comment #8 from M. Scherer  ---
First step:

https://github.com/fedora-selinux/selinux-policy/pull/229



[Gluster-infra] [Bug 1564372] Setup Nagios server

2018-09-26 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1564372



--- Comment #7 from M. Scherer  ---
Now, blocked with:

type=AVC msg=audit(1537979718.243:116446): avc:  denied  { name_connect } for 
pid=27096 comm="send_nsca" dest=5667
scontext=system_u:system_r:munin_t:s0-s0:c0.c1023
tcontext=system_u:object_r:unreserved_port_t:s0 tclass=tcp_socket

Guess I might need to write my own policy.
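
To make the denial above easier to read, here is a small sketch that pulls the
fields out of an AVC record and prints the allow rule the denial corresponds
to. The helper names are invented for this example, and the generated rule is
only a reading aid: it shows which domain lacks which permission, not what the
final policy should contain.

```python
import re

# Matches the "avc: denied { perms } for key=value ..." part of a record.
AVC_RE = re.compile(
    r"avc:\s+denied\s+\{\s*(?P<perms>[^}]*?)\s*\}\s+for\s+(?P<rest>.*)", re.S
)


def parse_avc(record):
    """Return the key=value fields of an AVC denial, plus a 'perms' list."""
    match = AVC_RE.search(record)
    if match is None:
        raise ValueError("not an AVC denial record")
    pairs = re.findall(r'(\w+)=("[^"]*"|\S+)', match.group("rest"))
    fields = {key: value.strip('"') for key, value in pairs}
    fields["perms"] = match.group("perms").split()
    return fields


def type_of(context):
    """Extract the type from a user:role:type:level SELinux context."""
    return context.split(":")[2]


def allow_rule(fields):
    """Format the allow rule that the denial corresponds to."""
    perms = fields["perms"]
    perm_text = perms[0] if len(perms) == 1 else "{ " + " ".join(perms) + " }"
    return "allow {} {}:{} {};".format(
        type_of(fields["scontext"]),
        type_of(fields["tcontext"]),
        fields["tclass"],
        perm_text,
    )
```

Fed the record above, this says that munin_t lacks name_connect on
unreserved_port_t for tcp_socket, i.e. send_nsca running in the munin domain
is not allowed to connect to port 5667.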



[Gluster-infra] [Bug 1564372] Setup Nagios server

2018-09-26 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1564372



--- Comment #6 from M. Scherer  ---
So, the munin -> nagios connection does work, but:

- I hit an SELinux issue:

type=AVC msg=audit(1537977117.718:115791): avc:  denied  { search } for 
pid=19206 comm="send_nsca" name="nagios" dev="dm-0" ino=271810
scontext=system_u:system_r:munin_t:s0-s0:c0.c1023
tcontext=system_u:object_r:nagios_etc_t:s0 tclass=dir


This one shouldn't be too hard to fix.

- I have to understand how munin is supposed to be integrated. For example, I
see:

[1537976773] EXTERNAL COMMAND:
PROCESS_SERVICE_CHECK_RESULT;supercolony.gluster.org;Disk usage in
percent;1;WARNINGs: / is 93.80 (outside range [:92]).
[1537976773] Warning:  Passive check result was received for service 'Disk
usage in percent' on host 'supercolony.gluster.org', but the service could not
be found!

- find out why supercolony does alert, but not the builder I set up at 100% CPU
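
One way to read the "service could not be found" warning above: a passive
result carries a host name and a service description, and Nagios only accepts
it when a service with exactly that description is defined for that host. The
sketch below parses the logged external command and checks it against a set of
defined services. The names are hypothetical, and the defined_services set is
a stand-in for whatever the real Nagios configuration declares.

```python
from typing import NamedTuple

LOG_PREFIX = "EXTERNAL COMMAND: "


class PassiveResult(NamedTuple):
    timestamp: int
    host: str
    service: str
    return_code: int
    output: str


def parse_passive_result(line):
    """Parse '[time] PROCESS_SERVICE_CHECK_RESULT;host;service;code;output'.

    Accepts the line with or without the 'EXTERNAL COMMAND: ' log prefix.
    """
    stamp, _, rest = line.partition("] ")
    if rest.startswith(LOG_PREFIX):
        rest = rest[len(LOG_PREFIX):]
    # The plugin output may contain ';', so split at most four times.
    name, host, service, code, output = rest.split(";", 4)
    if name != "PROCESS_SERVICE_CHECK_RESULT":
        raise ValueError("unexpected external command: " + name)
    return PassiveResult(int(stamp.lstrip("[")), host, service, int(code), output)


def is_defined(result, defined_services):
    """True if (host, service description) matches a defined service exactly."""
    return (result.host, result.service) in defined_services
```

With the log line above, the service description is "Disk usage in percent",
so the result is dropped unless a service with that exact description exists
on supercolony.gluster.org.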



[Gluster-infra] [Bug 1564372] Setup Nagios server

2018-09-26 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1564372



--- Comment #5 from M. Scherer  ---
So:

All servers managed by Ansible are now monitored for ping/ssh (which let me
see that our FreeBSD hosts block ping, because I got paged for that as soon as
I deployed). That is, all but gerrit prod.


I have added an SMTP port check on supercolony, and vhost checking for a couple
of websites; see the ansible repo for details.

For now, while I clean up the roles and related bits, I am the only one
receiving alerts, but we will need a plan for the future. I discussed this with
Nigel on IRC.

Notes for myself (and people who care), here is the list of things to do:
- investigate NRPE further (like the security impact of having it open on the
NATed IP of the cage)
- add the munin/nagios connection
- add process checks:
   - cron
   - custom process

- add custom checks (gerrit, jenkins server being offline, etc.)

- refine the httpd check (more than just "HTTP 200")
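
As an illustration of the last item, a check that goes beyond "HTTP 200" could
also require some expected text in the page and flag slow responses. This is
a sketch under assumptions, not what is in the ansible repo: the function
names, the expected text and the 2 second threshold are all invented for the
example.

```python
import time
import urllib.error
import urllib.request

OK, WARNING, CRITICAL = 0, 1, 2  # usual Nagios plugin exit codes


def evaluate_http(status, body, expected_text, elapsed, slow_after=2.0):
    """Map an HTTP response to (exit_code, message).

    A 200 alone is not enough: the body must contain expected_text, which
    catches a vhost that serves a default or error page with status 200.
    """
    if status != 200:
        return CRITICAL, f"CRITICAL: HTTP {status}"
    if expected_text not in body:
        return CRITICAL, "CRITICAL: HTTP 200 but expected text is missing"
    if elapsed > slow_after:
        return WARNING, f"WARNING: HTTP 200 in {elapsed:.2f}s"
    return OK, f"OK: HTTP 200 in {elapsed:.2f}s"


def fetch(url, timeout=10.0):
    """Return (status, body, elapsed_seconds) for a GET request.

    HTTP error statuses are returned rather than raised; connection
    failures (urllib.error.URLError) propagate to the caller.
    """
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
            body = response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        status, body = err.code, ""
    return status, body, time.monotonic() - started
```

Usage would be something like evaluate_http(*fetch(url), expected_text="Gluster"),
with the URL and the text chosen per vhost.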



[Gluster-infra] [Bug 1564372] Setup Nagios server

2018-09-10 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1564372



--- Comment #4 from M. Scherer  ---
So, I reused the existing role I had and set up a Nagios server.

Now, I need to:
- move munin internally (the server is installed; I need to clean up the role
and move the data)
- connect munin and nagios
- add more checks to nagios (the hard part: doing that without repeating data
all over the place)
- add more servers

So far it works, because I got paged for an IPv6 problem in the cage (because
there is no IPv6 in the cage in the first place...)



[Gluster-infra] [Bug 1564372] Setup Nagios server

2018-06-20 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

Shyamsundar  changed:

   What|Removed |Added
-------+--------+--------------------
     CC|        |srang...@redhat.com
Version|4.0     |mainline





[Gluster-infra] [Bug 1564372] Setup Nagios server

2018-06-20 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

M. Scherer  changed:

      What|Removed |Added
----------+--------+---------
    Status|CLOSED  |NEW
Resolution|EOL     |---
  Keywords|        |Reopened





[Gluster-infra] [Bug 1564372] Setup Nagios server

2018-06-20 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

Shyamsundar  changed:

       What|Removed |Added
-----------+--------+--------------------
     Status|NEW     |CLOSED
 Resolution|---     |EOL
Last Closed|        |2018-06-20 14:25:33



--- Comment #3 from Shyamsundar  ---
This bug was reported against a version of Gluster that is no longer maintained
(or has been EOL'd). See https://www.gluster.org/release-schedule/ for the
versions currently maintained.

As a result this bug is being closed.

If the bug persists on a maintained version of gluster or against the mainline
gluster repository, request that it be reopened and the Version field be marked
appropriately.



[Gluster-infra] [Bug 1564372] Setup Nagios server

2018-04-09 Thread bugzilla
https://bugzilla.redhat.com/show_bug.cgi?id=1564372



--- Comment #2 from Nigel Babu  ---
I'd say alert to a list like ale...@gluster.org. We'll still do best effort
working day coverage. This only enhances our ability to see what fails sooner
than someone else noticing the failure.
