RobH has uploaded a new change for review. https://gerrit.wikimedia.org/r/267052
Change subject: Add alert for elasticsearch 50th percentile prefix search time
......................................................................

Add alert for elasticsearch 50th percentile prefix search time

Typically prefix search is around 10-30ms. If it hits 75 or 150ms there
is almost certainly something wrong that should be addressed.

Requires adding discovery-ale...@lists.wikimedia.org to the private puppet
repo containing alerting email addresses (contacts.cfg)

Bug: T124542

This reverts commit f416ea1f678d48004dc205c9c3611d992e7a7207.

Change-Id: I3cc858db262edb88eed5c0ea409e974565a52707
---
M manifests/site.pp
M modules/nagios_common/files/contactgroups.cfg
A modules/role/manifests/elasticsearch/alerts.pp
3 files changed, 17 insertions(+), 1 deletion(-)

  git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/52/267052/1

diff --git a/manifests/site.pp b/manifests/site.pp
index 4c17989..532dec9 100644
--- a/manifests/site.pp
+++ b/manifests/site.pp
@@ -1170,7 +1170,7 @@

 # Primary graphite machines
 node 'graphite1001.eqiad.wmnet' {
-    role graphite::production, statsdlb, performance, graphite::alerts, restbase::alerts, graphite::alerts::reqstats
+    role graphite::production, statsdlb, performance, graphite::alerts, restbase::alerts, graphite::alerts::reqstats, elasticsearch::alerts
     include standard
 }

diff --git a/modules/nagios_common/files/contactgroups.cfg b/modules/nagios_common/files/contactgroups.cfg
index 3de6f46..8df55d9 100644
--- a/modules/nagios_common/files/contactgroups.cfg
+++ b/modules/nagios_common/files/contactgroups.cfg
@@ -64,3 +64,8 @@
     contactgroup_name wdqs-admins
     members smalyshev
 }
+
+define contactgroup {
+    contactgroup_name team-discovery
+    members discovery-alerts
+}
diff --git a/modules/role/manifests/elasticsearch/alerts.pp b/modules/role/manifests/elasticsearch/alerts.pp
new file mode 100644
index 0000000..0f86ea5
--- /dev/null
+++ b/modules/role/manifests/elasticsearch/alerts.pp
@@ -0,0 +1,11 @@
+class role::elasticsearch::alerts {
+    monitoring::graphite_threshold { 'prefix_search_50th_percentile':
+        description   => 'Prefix search 50th percentile latency',
+        metric        => 'transformNull(MediaWiki.CirrusSearch.requestTimeMs.prefix.p50, 0)',
+        from          => '10min',
+        warning       => '75',
+        critical      => '150',
+        percentage    => '20',
+        contact_group => 'team-discovery',
+    }
+}

--
To view, visit https://gerrit.wikimedia.org/r/267052
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I3cc858db262edb88eed5c0ea409e974565a52707
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: RobH <r...@wikimedia.org>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
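
Note on the alert parameters: the sketch below illustrates one plausible reading of how the `warning`, `critical`, and `percentage` values in this change interact over the `from` window. It is an assumption about the check's behaviour, not the actual implementation behind `monitoring::graphite_threshold`; the function name `evaluate_threshold`, the strict `>` comparison, and the `>=` test against `percentage` are all hypothetical choices made for illustration.

```python
from typing import Optional, Sequence


def evaluate_threshold(
    datapoints: Sequence[Optional[float]],
    warning: float,
    critical: float,
    percentage: float,
) -> str:
    """Hypothetical evaluation of a percentage-based threshold alert.

    Assumption: the alert fires when at least `percentage` percent of the
    datapoints in the window exceed the given limit.
    """
    # Mimic transformNull(..., 0): missing datapoints are treated as 0.
    values = [0.0 if v is None else float(v) for v in datapoints]
    if not values:
        return "UNKNOWN"

    def share_above(limit: float) -> float:
        # Percentage of datapoints strictly above the limit.
        return 100.0 * sum(1 for v in values if v > limit) / len(values)

    if share_above(critical) >= percentage:
        return "CRITICAL"
    if share_above(warning) >= percentage:
        return "WARNING"
    return "OK"


# Values from this change: warning 75 ms, critical 150 ms, percentage 20.
normal = [20, 25, None, 30, 22, 28, 21, 24, 26, 23]
slow = [20, 80, 90, 25, 22, 28, 21, 24, 26, 23]
print(evaluate_threshold(normal, 75, 150, 20))
print(evaluate_threshold(slow, 75, 150, 20))
```

Under this reading, a single slow datapoint in a ten-point window would not trigger the alert, while two or more datapoints above 75 ms would raise a warning.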