RobH has submitted this change and it was merged.

Change subject: Add alert for elasticsearch 50th percentile prefix search time
......................................................................


Add alert for elasticsearch 50th percentile prefix search time

Prefix search typically takes around 10-30ms. If the 50th percentile
reaches 75ms (warning) or 150ms (critical), something is almost
certainly wrong and should be addressed.

Requires adding discovery-ale...@lists.wikimedia.org to contacts.cfg in
the private puppet repo, which holds the alerting email addresses.

Bug: T124542
Change-Id: I9c7b79f7af221c0d32ba1c6baa39c55f1bc92d8d
---
M manifests/site.pp
M modules/nagios_common/files/contactgroups.cfg
A modules/role/manifests/elasticsearch/alerts.pp
3 files changed, 17 insertions(+), 1 deletion(-)

Approvals:
  RobH: Looks good to me, approved
  jenkins-bot: Verified



diff --git a/manifests/site.pp b/manifests/site.pp
index 4c17989..532dec9 100644
--- a/manifests/site.pp
+++ b/manifests/site.pp
@@ -1170,7 +1170,7 @@
 
 # Primary graphite machines
 node 'graphite1001.eqiad.wmnet' {
-    role graphite::production, statsdlb, performance, graphite::alerts, restbase::alerts, graphite::alerts::reqstats
+    role graphite::production, statsdlb, performance, graphite::alerts, restbase::alerts, graphite::alerts::reqstats, elasticsearch::alerts
     include standard
 }
 
diff --git a/modules/nagios_common/files/contactgroups.cfg b/modules/nagios_common/files/contactgroups.cfg
index 3de6f46..8df55d9 100644
--- a/modules/nagios_common/files/contactgroups.cfg
+++ b/modules/nagios_common/files/contactgroups.cfg
@@ -64,3 +64,8 @@
     contactgroup_name   wdqs-admins
     members             smalyshev
 }
+
+define contactgroup {
+    contactgroup_name   team-discovery
+    members             discovery-alerts
+}
diff --git a/modules/role/manifests/elasticsearch/alerts.pp b/modules/role/manifests/elasticsearch/alerts.pp
new file mode 100644
index 0000000..0f86ea5
--- /dev/null
+++ b/modules/role/manifests/elasticsearch/alerts.pp
@@ -0,0 +1,11 @@
+class role::elasticsearch::alerts {
+    monitoring::graphite_threshold { 'prefix_search_50th_percentile':
+        description    => 'Prefix search 50th percentile latency',
+        metric         => 'transformNull(MediaWiki.CirrusSearch.requestTimeMs.prefix.p50, 0)',
+        from           => '10min',
+        warning        => '75',
+        critical       => '150',
+        percentage     => '20',
+        contact_group  => 'team-discovery',
+    }
+}
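
For readers unfamiliar with percentage-based threshold checks, the
following is a minimal Python sketch of how the parameters in the new
resource (warning 75, critical 150, percentage 20, a 10 minute window)
might interact. It is not the code behind monitoring::graphite_threshold.
The function name evaluate_threshold, the status strings, the use of >=
for comparisons, and the reading of "percentage" as the share of
datapoints in the window that must breach a threshold are all
assumptions made for illustration.

```python
from typing import Optional, Sequence

# Status labels are illustrative; they are not taken from the real check.
OK, WARNING, CRITICAL, UNKNOWN = "OK", "WARNING", "CRITICAL", "UNKNOWN"


def evaluate_threshold(
    datapoints: Sequence[Optional[float]],
    warning: float,
    critical: float,
    percentage: float,
) -> str:
    """Classify a window of datapoints against warning/critical limits.

    Assumption: an alert fires when at least `percentage` percent of the
    datapoints in the window are at or above the given limit.
    """
    # Treat missing datapoints as 0, as transformNull(..., 0) does in the metric.
    values = [0.0 if v is None else float(v) for v in datapoints]
    if not values:
        return UNKNOWN

    def percent_at_or_over(limit: float) -> float:
        breaches = sum(1 for v in values if v >= limit)
        return 100.0 * breaches / len(values)

    # Check the more severe condition first.
    if percent_at_or_over(critical) >= percentage:
        return CRITICAL
    if percent_at_or_over(warning) >= percentage:
        return WARNING
    return OK


# Ten hypothetical one-minute p50 samples (ms), two of them above 75.
window = [20, 22, 18, 25, 21, 19, 23, 20, 80, 90]
print(evaluate_threshold(window, warning=75, critical=150, percentage=20))
```

Under these assumptions a single slow datapoint out of ten (10%) would
not alert, while two out of ten (20%) would, which is one way to keep
brief spikes from notifying team-discovery.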

-- 
To view, visit https://gerrit.wikimedia.org/r/265942
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I9c7b79f7af221c0d32ba1c6baa39c55f1bc92d8d
Gerrit-PatchSet: 5
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: EBernhardson <ebernhard...@wikimedia.org>
Gerrit-Reviewer: Giuseppe Lavagetto <glavage...@wikimedia.org>
Gerrit-Reviewer: RobH <r...@wikimedia.org>
Gerrit-Reviewer: jenkins-bot <>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
