[Gluster-infra] [Bug 1564372] Setup Nagios server
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

Worker Ant changed:

What        |Removed              |Added
Status      |ASSIGNED             |CLOSED
Resolution  |---                  |UPSTREAM
Last Closed |2018-06-20 18:25:33  |2020-03-12 13:03:25

--- Comment #15 from Worker Ant ---
This bug is moved to https://github.com/gluster/project-infrastructure/issues/41, and will be tracked there from now on. Visit GitHub issues URL for further details

--
You are receiving this mail because:
You are on the CC list for the bug.
___
Gluster-infra mailing list
Gluster-infra@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-infra
[Gluster-infra] [Bug 1564372] Setup Nagios server
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

--- Comment #12 from M. Scherer ---
So, NRPE seems to be confined, notifications have improved (the text messages are better than before), and I am adding servers one by one.
[Gluster-infra] [Bug 1564372] Setup Nagios server
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

--- Comment #11 from M. Scherer ---
So, status (again, mostly for myself):
- the check for processes stuck in Z state is done and working
- the check for SELinux is done and tested
- the munin notifications should now clear themselves
- the check for a specific process is done and working, tested on squid/ubunoun

Next steps:
- verify NRPE again in detail (for example, is it properly confined by SELinux, and what can a rogue client achieve)
- improve notifications
- add more checks on various servers
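For readers who want to see what a check for processes stuck in Z state can look like, here is a minimal sketch of a Nagios-style plugin. It is not the code deployed here. It assumes a Linux /proc layout, and the function names and thresholds (warn=1, crit=5) are hypothetical. The exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import os

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def parse_state(stat_line):
    """Return the one-letter process state from a /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so the state is the first field after the LAST ')'.
    after_comm = stat_line[stat_line.rfind(")") + 1:].split()
    return after_comm[0]


def count_zombies(proc_root="/proc"):
    """Count processes whose state is 'Z' (zombie)."""
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                if parse_state(handle.read()) == "Z":
                    count += 1
        except OSError:
            # The process exited between listdir() and open(); ignore it.
            continue
    return count


def evaluate(zombies, warn=1, crit=5):
    """Map a zombie count to (exit code, status line). Thresholds are assumptions."""
    if zombies >= crit:
        return CRITICAL, "CRITICAL - %d zombie processes" % zombies
    if zombies >= warn:
        return WARNING, "WARNING - %d zombie processes" % zombies
    return OK, "OK - %d zombie processes" % zombies


def main():
    """Print the status line and return the exit code for the plugin."""
    try:
        code, message = evaluate(count_zombies())
    except Exception as exc:  # anything unexpected is reported as UNKNOWN
        code, message = UNKNOWN, "UNKNOWN - %s" % exc
    print(message)
    return code
```

To use it as a plugin, a wrapper script would end with `sys.exit(main())`, so that the monitoring agent sees both the status line and the exit code.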
[Gluster-infra] [Bug 1564372] Setup Nagios server
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

--- Comment #10 from M. Scherer ---
So, I deployed NRPE internally and am testing it on the munin server. Right now, it only checks the load and looks for zombie processes, but I have code for SELinux and for checking the rpm db, and I think I have an architecture for adding more.
[Gluster-infra] [Bug 1564372] Setup Nagios server
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

--- Comment #9 from M. Scherer ---
Second step:
https://github.com/fedora-selinux/selinux-policy-contrib/pull/72

In the meantime, I will make munin run unconfined on the server side until I can work on a send_nsca policy.
[Gluster-infra] [Bug 1564372] Setup Nagios server
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

--- Comment #8 from M. Scherer ---
First step:
https://github.com/fedora-selinux/selinux-policy/pull/229
[Gluster-infra] [Bug 1564372] Setup Nagios server
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

--- Comment #7 from M. Scherer ---
Now, I am blocked by:

type=AVC msg=audit(1537979718.243:116446): avc: denied { name_connect } for pid=27096 comm="send_nsca" dest=5667 scontext=system_u:system_r:munin_t:s0-s0:c0.c1023 tcontext=system_u:object_r:unreserved_port_t:s0 tclass=tcp_socket

I guess I might need to write my own policy.
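To make the denial above easier to read, here is a small sketch that pulls out the fields a policy author would care about: which domain was denied which permission, on which target type and class. The helper names are hypothetical and the parsing only covers the key=value layout seen in this message, so treat it as an illustration rather than a general audit log parser.

```python
import re

# The denial quoted in this comment, kept as one string.
SAMPLE = (
    'type=AVC msg=audit(1537979718.243:116446): avc: denied { name_connect } '
    'for pid=27096 comm="send_nsca" dest=5667 '
    'scontext=system_u:system_r:munin_t:s0-s0:c0.c1023 '
    'tcontext=system_u:object_r:unreserved_port_t:s0 tclass=tcp_socket'
)


def parse_avc(line):
    """Extract the main fields of an AVC denial into a dict."""
    perms = re.search(r"denied\s+\{([^}]*)\}", line)
    if perms is None:
        raise ValueError("not an AVC denial: %r" % line)
    # Collect key=value pairs; values are either quoted or run to the next space.
    fields = {
        key: value.strip('"')
        for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    }
    return {
        "permissions": perms.group(1).split(),
        "comm": fields.get("comm"),
        # A context is user:role:type:level, so the type is the third part.
        "source_type": fields["scontext"].split(":")[2],
        "target_type": fields["tcontext"].split(":")[2],
        "tclass": fields.get("tclass"),
        "dest": fields.get("dest"),
    }


def summarize(denial):
    """Render a one-line, human-readable summary of a parsed denial."""
    return "%s running as %s was denied %s on %s (%s)" % (
        denial["comm"],
        denial["source_type"],
        " ".join(denial["permissions"]),
        denial["target_type"],
        denial["tclass"],
    )
```

Calling `summarize(parse_avc(SAMPLE))` shows that send_nsca, running in the munin_t domain, is not allowed to open a TCP connection to port 5667, which the policy labels as an unreserved port.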
[Gluster-infra] [Bug 1564372] Setup Nagios server
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

--- Comment #6 from M. Scherer ---
So, the munin -> nagios connection does work, but:

- I hit an SELinux issue:

type=AVC msg=audit(1537977117.718:115791): avc: denied { search } for pid=19206 comm="send_nsca" name="nagios" dev="dm-0" ino=271810 scontext=system_u:system_r:munin_t:s0-s0:c0.c1023 tcontext=system_u:object_r:nagios_etc_t:s0 tclass=dir

  This one shouldn't be too hard to fix.

- I have to understand how munin is supposed to be integrated. For example, I see:

[1537976773] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;supercolony.gluster.org;Disk usage in percent;1;WARNINGs: / is 93.80 (outside range [:92]).
[1537976773] Warning: Passive check result was received for service 'Disk usage in percent' on host 'supercolony.gluster.org', but the service could not be found!

- I need to see why supercolony does alert, but not the builder I set up at 100% CPU.
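The "service could not be found" warning above comes from how passive results are matched: Nagios looks up the host name and service description carried in the PROCESS_SERVICE_CHECK_RESULT command, and both have to match a configured service exactly. The sketch below splits the logged command into its fields and checks them against a set of defined services. The `defined_services` mapping and the helper names are hypothetical stand-ins for the real Nagios configuration.

```python
# The external command line as it appears in the nagios log above.
SAMPLE_LOG = (
    "[1537976773] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;"
    "supercolony.gluster.org;Disk usage in percent;1;"
    "WARNINGs: / is 93.80 (outside range [:92])."
)


def parse_passive_result(log_line):
    """Split a logged PROCESS_SERVICE_CHECK_RESULT command into its fields."""
    # Layout: [timestamp] EXTERNAL COMMAND: NAME;host;service;return_code;output
    timestamp_part, _, rest = log_line.partition("] ")
    command = rest.split("EXTERNAL COMMAND: ", 1)[-1]
    # Split at most 4 times so that ';' inside the plugin output is preserved.
    name, host, service, code, output = command.split(";", 4)
    if name != "PROCESS_SERVICE_CHECK_RESULT":
        raise ValueError("unexpected command: %s" % name)
    return {
        "timestamp": int(timestamp_part.lstrip("[")),
        "host": host,
        "service": service,
        "return_code": int(code),
        "output": output,
    }


def is_known(result, defined_services):
    """True if the (host, service description) pair is configured.

    defined_services is a hypothetical mapping of host name to the set of
    service descriptions defined for that host.
    """
    return result["service"] in defined_services.get(result["host"], set())
```

With only ping and ssh defined for supercolony.gluster.org, `is_known` returns False for the result above. Defining a passive service whose description is exactly "Disk usage in percent" on that host would make it match.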
[Gluster-infra] [Bug 1564372] Setup Nagios server
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

--- Comment #5 from M. Scherer ---
So:

All servers managed by ansible (that is, all but gerrit prod) are now monitored for ping/ssh. This let me see that our FreeBSD hosts block ping, because I got paged for that as soon as I deployed.

I have added an SMTP port check on supercolony, and vhost checking for a couple of web sites; see the ansible repo for details.

For now, while I clean up the roles, I am the only one receiving alerts, but we will need a plan for the future. I discussed this with nigel on IRC.

Notes for myself (and people who care), here is the list of things to do:
- investigate NRPE further (for example, the security impact of having it open on the NATed IP of the cage)
- add the munin/nagios connection
- add process checks:
  - cron
  - custom processes
- add custom checks (gerrit, jenkins server being offline, etc.)
- refine the httpd check (more than just "http 200")
[Gluster-infra] [Bug 1564372] Setup Nagios server
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

--- Comment #4 from M. Scherer ---
So, I reused the existing role I had and set up a nagios server.

Now, I need to:
- move munin internally (the server is installed; I need to clean the role and move the data)
- connect munin/nagios
- add more checks to nagios (the hard part is doing that without repeating data all over the place)
- add more servers

So far, it has worked, because I got paged for an IPv6 problem in the cage (because there is no IPv6 in the cage in the first place...)
[Gluster-infra] [Bug 1564372] Setup Nagios server
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

Shyamsundar changed:

What     |Removed  |Added
CC       |         |srang...@redhat.com
Version  |4.0      |mainline
[Gluster-infra] [Bug 1564372] Setup Nagios server
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

M. Scherer changed:

What        |Removed  |Added
Status      |CLOSED   |NEW
Resolution  |EOL      |---
Keywords    |         |Reopened
[Gluster-infra] [Bug 1564372] Setup Nagios server
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

Shyamsundar changed:

What         |Removed  |Added
Status       |NEW      |CLOSED
Resolution   |---      |EOL
Last Closed  |         |2018-06-20 14:25:33

--- Comment #3 from Shyamsundar ---
This bug is reported against a version of Gluster that is no longer maintained (or has been EOL'd). See https://www.gluster.org/release-schedule/ for the versions currently maintained.

As a result, this bug is being closed.

If the bug persists on a maintained version of gluster or against the mainline gluster repository, request that it be reopened and that the Version field be marked appropriately.
[Gluster-infra] [Bug 1564372] Setup Nagios server
https://bugzilla.redhat.com/show_bug.cgi?id=1564372

--- Comment #2 from Nigel Babu ---
I'd say alert to a list like ale...@gluster.org. We'll still do best-effort, working-day coverage. This only enhances our ability to see what fails before someone else notices the failure.