Giuseppe Lavagetto has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/161450

Change subject: HAT: depool faulty servers, monitor hhvm rendering
......................................................................

HAT: depool faulty servers, monitor hhvm rendering

Since hhvm has crashed badly a few times, this change adds two
mitigations:

* lower the depool threshold so faulty servers can keep being depooled
  as long as at least 30% of the cluster stays pooled
* monitor a URL served by the PHP engine, not one served directly by
  apache
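
To make the threshold change concrete, here is a minimal sketch of the
arithmetic. It assumes 'depool-threshold' is the fraction of the pool
that must stay pooled, rounded up; the function names and the 20-server
pool are hypothetical and are not taken from the load balancer's code.

```python
import math
from decimal import Decimal


def min_pooled(total_servers, depool_threshold):
    """Smallest number of servers that must stay pooled.

    Assumption: the threshold is a fraction of the whole pool, rounded up.
    Decimal avoids binary floating-point surprises with values like ".3".
    """
    return math.ceil(Decimal(total_servers) * Decimal(depool_threshold))


def can_depool(total_servers, pooled, depool_threshold):
    """True if removing one more server keeps the pool at or above the minimum."""
    return pooled - 1 >= min_pooled(total_servers, depool_threshold)


# Hypothetical 20-server pool, comparing the old and new values from the diff.
old_limit = 20 - min_pooled(20, ".9")  # servers that may be depooled at .9
new_limit = 20 - min_pooled(20, ".3")  # servers that may be depooled at .3
print(old_limit, new_limit)  # prints "2 14"
```

Under these assumptions the old value lets only a small part of the
pool be removed, so crashed servers would keep receiving traffic, while
the new value lets most of the pool be removed before depooling stops.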

Change-Id: I77cd3510a5d22a96ff4602eca7b5b539a2a36ee1
Signed-off-by: Giuseppe Lavagetto <glavage...@wikimedia.org>
---
M manifests/role/mediawiki.pp
M modules/lvs/manifests/configuration.pp
M templates/icinga/checkcommands.cfg.erb
3 files changed, 14 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/50/161450/1

diff --git a/manifests/role/mediawiki.pp b/manifests/role/mediawiki.pp
index 218aa5f..2d70dae 100644
--- a/manifests/role/mediawiki.pp
+++ b/manifests/role/mediawiki.pp
@@ -75,7 +75,13 @@
     system::role { 'role::mediawiki::appserver': }
 
     if ubuntu_version('>= trusty') {
+
         $pool = 'hhvm_appservers'
+
+        monitor_service { 'appserver_http_hhvm':
+            description   => 'HHVM rendering',
+            check_command => 'check_http_wikipedia_main',
+        }
     } else {
         $pool = 'apaches'
     }
diff --git a/modules/lvs/manifests/configuration.pp b/modules/lvs/manifests/configuration.pp
index b3bd2e9..3757228 100644
--- a/modules/lvs/manifests/configuration.pp
+++ b/modules/lvs/manifests/configuration.pp
@@ -616,7 +616,7 @@
             'sites' => [ "eqiad" ],
             'ip' => $service_ips['hhvm_appservers'][$::site],
             'bgp' => "yes",
-            'depool-threshold' => ".9",
+            'depool-threshold' => ".3",
             'monitors' => {
                 'ProxyFetch' => {
                     'url' => [ 'http://en.wikipedia.org/wiki/Main_Page' ],
diff --git a/templates/icinga/checkcommands.cfg.erb b/templates/icinga/checkcommands.cfg.erb
index 869e36c..575ff15 100644
--- a/templates/icinga/checkcommands.cfg.erb
+++ b/templates/icinga/checkcommands.cfg.erb
@@ -230,6 +230,13 @@
         command_line    $USER1$/check_http -H en.wikipedia.org -I $HOSTADDRESS$ -u /
         }
 
+# 'check_http_wikipedia_main' command definition, querying the main page
+define command{
+        command_name    check_http_wikipedia_main
+        command_line    $USER1$/check_http -H en.wikipedia.org -I $HOSTADDRESS$ -u /wiki/Main_Page
+        }
+
+
 # 'check_http_upload' command definition, querying a different URL
 define command{
         command_name    check_http_upload
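
As an illustration of what the new check_http_wikipedia_main command
runs once the macros are expanded, the sketch below builds the same
argument list in Python. The plugin directory and the host address are
hypothetical stand-ins for $USER1$ and $HOSTADDRESS$; the -H, -I and -u
flags are the ones used in the command definition above.

```python
import shlex


def check_http_argv(plugin_dir, host_address,
                    vhost="en.wikipedia.org", path="/wiki/Main_Page"):
    """Build the check_http command line for one application server.

    -H sets the virtual host name, -I the address to connect to, and
    -u the URL path to request.
    """
    return [plugin_dir + "/check_http",
            "-H", vhost, "-I", host_address, "-u", path]


# Hypothetical values: a common plugin directory and a placeholder address.
argv = check_http_argv("/usr/lib/nagios/plugins", "10.0.0.1")
print(shlex.join(argv))
```

Requesting /wiki/Main_Page instead of / is what makes the check go
through the PHP engine rather than being answered by apache alone, as
the commit message describes.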

-- 
To view, visit https://gerrit.wikimedia.org/r/161450
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I77cd3510a5d22a96ff4602eca7b5b539a2a36ee1
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Giuseppe Lavagetto <glavage...@wikimedia.org>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
