Elukey has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/340719 )

Change subject: Add experimental JVM Heap usage alarm to Zookeeper prod instances
......................................................................


Add experimental JVM Heap usage alarm to Zookeeper prod instances

This alarm alerts only Analytics when a production Zookeeper
cluster shows high heap usage. The current format is not really
generic and it assumes that all the Zookeeper production instances
are owned by Analytics, but it is a good short/medium-term compromise
to keep a vital service monitored (Kafka and Hadoop do not work
without it).

Bug: T157968
Change-Id: I20bfc67a40626e078d71af0892b0a390e53344e3
---
M modules/role/manifests/zookeeper/server.pp
1 file changed, 13 insertions(+), 0 deletions(-)

Approvals:
  Elukey: Looks good to me, approved
  jenkins-bot: Verified



diff --git a/modules/role/manifests/zookeeper/server.pp b/modules/role/manifests/zookeeper/server.pp
index 6544fec..0532705 100644
--- a/modules/role/manifests/zookeeper/server.pp
+++ b/modules/role/manifests/zookeeper/server.pp
@@ -76,5 +76,18 @@
             # Critical if we go over 90% of max
             critical    => $::zookeeper::max_client_connections * 0.9,
         }
+
+        # Experimental Analytics alarms on JVM usage
+        # These alarms are not really generic and the thresholds are based
+        # on a fixed Max Heap size of 1G.
+        monitoring::graphite_threshold { 'zookeeper-server-heap-usage':
+            description   => 'Zookeeper node JVM Heap usage',
+            metric        => "${group_prefix}.jvm_memory.${::hostname}_eqiad_wmnet_${::zookeeper::jmxtrans::jmx_port}.memory.HeapMemoryUsage_used.upper",
+            from          => '60min',
+            warning       => '921',  # 90% of the Heap used
+            critical      => '972',  # 95% of the Heap used
+            percentage    => '60',
+            contact_group => 'analytics',
+        }
     }
 }
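
The warning and critical values in the patch follow from the fixed 1G Max Heap mentioned in the comment. As a rough sanity check, the sketch below reproduces them, assuming the heap size is counted as 1024 units (MiB) and that fractional results are truncated. The patch does not state the unit of the Graphite metric, so that is an assumption, and the function name `heap_thresholds` is hypothetical and not part of the Puppet code.

```python
def heap_thresholds(max_heap=1024, warning_pct=90, critical_pct=95):
    """Return (warning, critical) thresholds for a given max heap size.

    Assumption: max_heap is expressed in the same unit as the monitored
    metric (taken here to be MiB for a 1G heap); results are truncated
    to whole units using integer arithmetic.
    """
    warning = max_heap * warning_pct // 100
    critical = max_heap * critical_pct // 100
    return warning, critical


if __name__ == "__main__":
    warning, critical = heap_thresholds()
    print(f"warning={warning} critical={critical}")
```

If the Max Heap size of the Zookeeper JVM changes, both thresholds in the manifest would need to be recomputed in the same way.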

-- 
To view, visit https://gerrit.wikimedia.org/r/340719
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I20bfc67a40626e078d71af0892b0a390e53344e3
Gerrit-PatchSet: 3
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Elukey <ltosc...@wikimedia.org>
Gerrit-Reviewer: Elukey <ltosc...@wikimedia.org>
Gerrit-Reviewer: jenkins-bot <>