Filippo Giunchedi has submitted this change and it was merged.

Change subject: logging: update CirrusSearch thresholds
......................................................................


logging: update CirrusSearch thresholds

adjust thresholds to improve SNR

metric history is also available at
https://ganglia.wikimedia.org/latest/graph_all_periods.php?c=Miscellaneous%20eqiad&h=fluorine.eqiad.wmnet&r=hour&z=default&jr=&js=&st=1429622767&v=0.0167224080268&m=CirrusSearch-slow.log_line_rate&vl=lines%20per%20sec&ti=&z=large

Bug: T84163
Change-Id: I71c527168aa18ffd44e386fa59e159612944cb20
---
M manifests/role/logging.pp
1 file changed, 7 insertions(+), 11 deletions(-)

Approvals:
  Filippo Giunchedi: Verified; Looks good to me, approved
  Manybubbles: Looks good to me, but someone else must approve



diff --git a/manifests/role/logging.pp b/manifests/role/logging.pp
index b8e2891..246c4bb 100644
--- a/manifests/role/logging.pp
+++ b/manifests/role/logging.pp
@@ -111,23 +111,19 @@
         logster_options => '--output ganglia --metric-prefix CirrusSearch-slow.log',
         minute          => "*/${cirrussearch_slow_log_check_interval}"
     }
-    # Alert if CirrusSearch-slow.log shows more than
-    # 10 slow searches within an hour.  The logster
-    # job runs every $cirrussearch_slow_log_check_interval
+    # The logster job runs every $cirrussearch_slow_log_check_interval
     # minutes.  We set retries to
     # 60 minutes / cirrussearch_slow_log_check_interval minutes)
-    # This should keep icinga from alerting
-    # us unless the alert thresholds are exceeded
-    # for more than an hour.
+    # This should keep icinga from alerting us unless the alert thresholds are
+    # exceeded for more than an hour.
     monitoring::ganglia { 'CirrusSearch-slow-queries':
         description           => 'Slow CirrusSearch query rate',
         # this metric is output to ganglia by logster
         metric                => 'CirrusSearch-slow.log_line_rate',
-        # line_rate metric is per second, so we need to alert if this
-        # metric goes over 0.000046296 / second.  Let's round
-        # down to warning on 0.00004, or critical on 0.00008.
-        warning               => '0.00004',
-        critical              => '0.00008',
+        # warning  ->  36 queries/h
+        # critical -> 360 queries/h
+        warning               => '0.01',
+        critical              => '0.1',
         normal_check_interval => $cirrussearch_slow_log_check_interval,
         retry_check_interval  => $cirrussearch_slow_log_check_interval,
         retries               => (60/$cirrussearch_slow_log_check_interval),
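
The threshold comments in the hunk above rest on a per-second to per-hour
conversion that is easy to check by hand. The following is a minimal sketch of
that arithmetic in Python, not part of the merged change: the helper names are
hypothetical, and the 5-minute check interval is an assumed example value,
since the diff does not show what $cirrussearch_slow_log_check_interval is set
to.

```python
SECONDS_PER_HOUR = 3600


def per_hour_to_per_second(lines_per_hour):
    """Convert an hourly line count to the per-second rate logster reports."""
    return lines_per_hour / SECONDS_PER_HOUR


def per_second_to_per_hour(lines_per_second):
    """Convert a per-second line rate back to lines per hour."""
    return lines_per_second * SECONDS_PER_HOUR


def retries_for_window(window_minutes, check_interval_minutes):
    """Number of checks that fit in the window, using integer division
    as in the manifest's (60/$cirrussearch_slow_log_check_interval)."""
    return window_minutes // check_interval_minutes


# New thresholds from the diff: 36 and 360 slow queries per hour.
new_warning = per_hour_to_per_second(36)     # matches '0.01'
new_critical = per_hour_to_per_second(360)   # matches '0.1'

# Removed thresholds from the diff, expressed per hour for comparison.
old_warning_per_hour = per_second_to_per_hour(0.00004)
old_critical_per_hour = per_second_to_per_hour(0.00008)

# Assumed example interval of 5 minutes (not stated in the diff).
example_retries = retries_for_window(60, 5)

if __name__ == "__main__":
    print(f"new warning:  {new_warning} lines/s")
    print(f"new critical: {new_critical} lines/s")
    print(f"old warning:  {old_warning_per_hour:.3f} lines/h")
    print(f"old critical: {old_critical_per_hour:.3f} lines/h")
    print(f"retries at a 5-minute interval: {example_retries}")
```

Read this way, the removed values of 0.00004 and 0.00008 lines per second come
to well under one slow query per hour, while the new values correspond to the
36 and 360 queries per hour named in the added comments.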

-- 
To view, visit https://gerrit.wikimedia.org/r/205603

Gerrit-MessageType: merged
Gerrit-Change-Id: I71c527168aa18ffd44e386fa59e159612944cb20
Gerrit-PatchSet: 2
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Filippo Giunchedi <fgiunch...@wikimedia.org>
Gerrit-Reviewer: Chad <ch...@wikimedia.org>
Gerrit-Reviewer: Faidon Liambotis <fai...@wikimedia.org>
Gerrit-Reviewer: Filippo Giunchedi <fgiunch...@wikimedia.org>
Gerrit-Reviewer: Manybubbles <never...@wikimedia.org>
Gerrit-Reviewer: Ottomata <o...@wikimedia.org>
Gerrit-Reviewer: jenkins-bot <>