BryanDavis has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/231704

Change subject: Add icinga alert for anomalous logstash.rate.mediawiki.memcached.ERROR.count
......................................................................

Add icinga alert for anomalous logstash.rate.mediawiki.memcached.ERROR.count

Set up an Icinga alert that warns when more than 5%, and goes critical
when more than 15%, of the last 100 per-minute memcached error counts
from MediaWiki are above the Holt-Winters predictive model's confidence
band.
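
For illustration only, the threshold logic described above can be sketched
in Python. This is a minimal sketch, not the code of the actual check: the
function name classify_anomaly, its arguments, and the strict "more than"
comparison are assumptions made for the example, and the upper confidence
band is taken as given rather than computed from a Holt-Winters model.

```python
def classify_anomaly(samples, upper_band, warning_pct=5, critical_pct=15):
    """Classify one check window as 'OK', 'WARNING' or 'CRITICAL'.

    samples    -- observed error counts, one per sample in the window
    upper_band -- upper confidence bound for each sample (assumed to be
                  supplied by the caller, e.g. from a Holt-Winters forecast)

    Hypothetical helper: names and signature are illustrative only.
    """
    if not samples or len(samples) != len(upper_band):
        raise ValueError("samples and upper_band must be non-empty and equal length")

    # Count the samples that lie above the confidence band.
    over = sum(1 for value, upper in zip(samples, upper_band) if value > upper)
    pct_over = 100.0 * over / len(samples)

    # Assumption: thresholds are exclusive ("more than N%").
    if pct_over > critical_pct:
        return "CRITICAL"
    if pct_over > warning_pct:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    band = [10.0] * 100                    # flat upper bound, illustration only
    window = [20.0] * 6 + [1.0] * 94       # 6 of 100 samples above the band
    print(classify_anomaly(window, band))
```

With a check window of 100 samples, each sample above the band contributes
exactly one percentage point, so under these assumptions 6 such samples are
enough to warn and 16 are enough to go critical.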

Bug: T100735
Change-Id: I011f583b51862e54e653e9a900635c9ddf1e06e6
---
M manifests/role/graphite.pp
1 file changed, 16 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/04/231704/1

diff --git a/manifests/role/graphite.pp b/manifests/role/graphite.pp
index dd9ebd2..a07c063 100644
--- a/manifests/role/graphite.pp
+++ b/manifests/role/graphite.pp
@@ -277,6 +277,22 @@
         under        => true,
         group        => 'analytics_eqiad',
     }
+
+    # Monitor memcached error rate from MediaWiki. This is commonly a sign of
+    # a failing nutcracker instance that can be tracked down via
+    # https://logstash.wikimedia.org/#/dashboard/elasticsearch/memcached
+    monitoring::graphite_anomaly { 'mediawiki-memcached-anomaly':
+        description  => 'MediaWiki memcached error rate',
+        metric       => 'logstash.rate.mediawiki.memcached.ERROR.count',
+        # Check the last 100 samples and:
+        # - warn if more than 5% are above the confidence band
+        # - go critical if more than 15% are above the confidence band
+        check_window => 100,
+        warning      => 5,
+        critical     => 15,
+        over         => true,
+    }
+
 }
 
 # == Class: role::graphite::labmon

-- 
To view, visit https://gerrit.wikimedia.org/r/231704
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I011f583b51862e54e653e9a900635c9ddf1e06e6
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: BryanDavis <[email protected]>

_______________________________________________
MediaWiki-commits mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
