Elukey has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/340719 )
Change subject: Add experimental JVM Heap usage alarm to Zookeeper prod instances
......................................................................


Add experimental JVM Heap usage alarm to Zookeeper prod instances

This alarm will only alert Analytics if a production Zookeeper
cluster reaches high heap usage. The current format is not really
generic and it assumes that all the Zookeeper production instances are
owned by Analytics, but it is a good short/medium-term compromise to
keep a vital service monitored (Kafka and Hadoop do not work
without it).

Bug: T157968
Change-Id: I20bfc67a40626e078d71af0892b0a390e53344e3
---
M modules/role/manifests/zookeeper/server.pp
1 file changed, 13 insertions(+), 0 deletions(-)

Approvals:
  Elukey: Looks good to me, approved
  jenkins-bot: Verified


diff --git a/modules/role/manifests/zookeeper/server.pp b/modules/role/manifests/zookeeper/server.pp
index 6544fec..0532705 100644
--- a/modules/role/manifests/zookeeper/server.pp
+++ b/modules/role/manifests/zookeeper/server.pp
@@ -76,5 +76,18 @@
             # Critical if we go over 90% of max
             critical        => $::zookeeper::max_client_connections * 0.9,
         }
+
+        # Experimental Analytics alarms on JVM usage
+        # These alarms are not really generic and the thresholds are based
+        # on a fixed Max Heap size of 1G.
+        monitoring::graphite_threshold { 'zookeeper-server-heap-usage':
+            description   => 'Zookeeper node JVM Heap usage',
+            metric        => "${group_prefix}.jvm_memory.${::hostname}_eqiad_wmnet_${::zookeeper::jmxtrans::jmx_port}.memory.HeapMemoryUsage_used.upper",
+            from          => '60min',
+            warning       => '921',  # 90% of the Heap used
+            critical      => '972',  # 95% of the Heap used
+            percentage    => '60',
+            contact_group => 'analytics',
+        }
     }
 }

-- 
To view, visit https://gerrit.wikimedia.org/r/340719
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I20bfc67a40626e078d71af0892b0a390e53344e3
Gerrit-PatchSet: 3
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Elukey <ltosc...@wikimedia.org>
Gerrit-Reviewer: Elukey <ltosc...@wikimedia.org>
Gerrit-Reviewer: jenkins-bot <>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
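
Note on the thresholds: the warning and critical values in the change above ('921' and '972') are consistent with 90% and 95% of a 1024 MiB heap, rounded down. The following is a minimal Python sketch of that arithmetic only. It assumes the thresholds are expressed in MiB, which the change does not state explicitly, and `heap_thresholds` and its parameters are hypothetical names that do not appear in the Puppet change.

```python
def heap_thresholds(max_heap_mib, warning_pct=90, critical_pct=95):
    """Return (warning, critical) heap thresholds, rounded down.

    Assumption: the thresholds use the same unit as max_heap_mib (MiB here).
    Integer arithmetic avoids floating-point rounding surprises.
    """
    warning = max_heap_mib * warning_pct // 100
    critical = max_heap_mib * critical_pct // 100
    return warning, critical


# Fixed Max Heap size of 1G, as stated in the comment in the diff.
warning, critical = heap_thresholds(1024)
print(warning, critical)
```

If the Max Heap size of the Zookeeper JVMs changes, the hard-coded values in the manifest would need to be recomputed in the same way, since the comment in the diff says they are based on a fixed 1G heap.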