Giuseppe Lavagetto has uploaded a new change for review. https://gerrit.wikimedia.org/r/161450

Change subject: HAT: depool faulty servers, monitor hhvm rendering
......................................................................

HAT: depool faulty servers, monitor hhvm rendering

Given we have had a few issues with HHVM crashing badly, we add some
mitigations:
* keep depooling faulty servers as long as at least 30% of the cluster
  stays alive
* monitor a URL served by the PHP engine rather than directly by Apache

Change-Id: I77cd3510a5d22a96ff4602eca7b5b539a2a36ee1
Signed-off-by: Giuseppe Lavagetto <glavage...@wikimedia.org>
---
M manifests/role/mediawiki.pp
M modules/lvs/manifests/configuration.pp
M templates/icinga/checkcommands.cfg.erb
3 files changed, 14 insertions(+), 1 deletion(-)

  git pull ssh://gerrit.wikimedia.org:29418/operations/puppet refs/changes/50/161450/1

diff --git a/manifests/role/mediawiki.pp b/manifests/role/mediawiki.pp
index 218aa5f..2d70dae 100644
--- a/manifests/role/mediawiki.pp
+++ b/manifests/role/mediawiki.pp
@@ -75,7 +75,13 @@
     system::role { 'role::mediawiki::appserver': }
 
     if ubuntu_version('>= trusty') {
+
         $pool = 'hhvm_appservers'
+
+        monitor_service { 'appserver_http_hhvm':
+            description   => 'HHVM rendering',
+            check_command => 'check_http_wikipedia_main',
+        }
     } else {
         $pool = 'apaches'
     }
diff --git a/modules/lvs/manifests/configuration.pp b/modules/lvs/manifests/configuration.pp
index b3bd2e9..3757228 100644
--- a/modules/lvs/manifests/configuration.pp
+++ b/modules/lvs/manifests/configuration.pp
@@ -616,7 +616,7 @@
             'sites' => [ "eqiad" ],
             'ip' => $service_ips['hhvm_appservers'][$::site],
             'bgp' => "yes",
-            'depool-threshold' => ".9",
+            'depool-threshold' => ".3",
             'monitors' => {
                 'ProxyFetch' => {
                     'url' => [ 'http://en.wikipedia.org/wiki/Main_Page' ],
diff --git a/templates/icinga/checkcommands.cfg.erb b/templates/icinga/checkcommands.cfg.erb
index 869e36c..575ff15 100644
--- a/templates/icinga/checkcommands.cfg.erb
+++ b/templates/icinga/checkcommands.cfg.erb
@@ -230,6 +230,13 @@
     command_line    $USER1$/check_http -H en.wikipedia.org -I $HOSTADDRESS$ -u /
     }
 
+# 'check_http_wikipedia' command definition, querying the main page
+define command{
+    command_name    check_http_wikipedia_main
+    command_line    $USER1$/check_http -H en.wikipedia.org -I $HOSTADDRESS$ -u /wiki/Main_Page
+    }
+
+
 # 'check_http_upload' command definition, querying a different URL
 define command{
     command_name    check_http_upload

-- 
To view, visit https://gerrit.wikimedia.org/r/161450
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I77cd3510a5d22a96ff4602eca7b5b539a2a36ee1
Gerrit-PatchSet: 1
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Giuseppe Lavagetto <glavage...@wikimedia.org>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
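
For readers unfamiliar with the 'depool-threshold' setting changed in this patch (".9" to ".3"), the following is a minimal sketch of the behaviour the commit message describes, not the load balancer's actual implementation. The function names, the host names, and the exact comparison (a server may be depooled only if the pooled fraction stays at or above the threshold afterwards) are assumptions made for illustration.

```python
from decimal import Decimal


def can_depool(total_servers: int, pooled_servers: int, threshold: str) -> bool:
    """Return True if one more server may be depooled.

    Assumption: depooling is allowed only while the servers that remain
    pooled afterwards are at least `threshold` (a fraction written as a
    string, as in the config, e.g. ".3") of the whole cluster.
    """
    if total_servers <= 0:
        return False
    # Decimal avoids binary floating-point surprises for values like ".3".
    return (pooled_servers - 1) >= Decimal(threshold) * total_servers


def depool_faulty(health: dict, threshold: str) -> list:
    """Depool unhealthy servers in name order until the threshold blocks it.

    `health` maps a (hypothetical) host name to True when its check passes.
    Returns the names that were depooled.
    """
    pooled = set(health)
    depooled = []
    for name in sorted(health):
        if not health[name] and can_depool(len(health), len(pooled), threshold):
            pooled.discard(name)
            depooled.append(name)
    return depooled


if __name__ == "__main__":
    # Ten hypothetical servers, only two of them healthy.
    cluster = {f"mw{i}": i < 2 for i in range(10)}
    print(len(depool_faulty(cluster, ".9")), "depooled with threshold .9")
    print(len(depool_faulty(cluster, ".3")), "depooled with threshold .3")
```

Under these assumptions, a threshold of ".9" leaves most crashed servers in rotation during a large outage, while ".3" lets the balancer remove them until only 30% of the cluster remains pooled.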