Elukey has submitted this change and it was merged.

Change subject: Raise nagios retry_interval to avoid false alarms for HHVM restarts
......................................................................

Raise nagios retry_interval to avoid false alarms for HHVM restarts

In T147773 a regular HHVM restart workflow was introduced.
It may happen that the two retries for check_http_wikipedia_main
are quicker to complete than the restart, so we get false alarms
that can confuse people watching icinga/IRC.

Bug: T147773
Change-Id: I69168e1e9babc3f4805b715195ed065c052ed7de
---
M modules/role/manifests/mediawiki/webserver.pp
1 file changed, 10 insertions(+), 6 deletions(-)

Approvals:
  Elukey: Looks good to me, approved
  Giuseppe Lavagetto: Looks good to me, but someone else must approve
  jenkins-bot: Verified

diff --git a/modules/role/manifests/mediawiki/webserver.pp b/modules/role/manifests/mediawiki/webserver.pp
index 78898b8..857d4e6 100644
--- a/modules/role/manifests/mediawiki/webserver.pp
+++ b/modules/role/manifests/mediawiki/webserver.pp
@@ -47,16 +47,20 @@
     # If a service check happens to run while we are performing a
     # graceful restart of Apache, we want to try again before declaring
     # defeat. See T103008.
+    # We want to avoid false alarms during scheduled HHVM restarts (T147773),
+    # so a higher retry_interval is needed.
     monitoring::service { 'appserver http':
-        description   => 'Apache HTTP',
-        check_command => 'check_http_wikipedia',
-        retries       => 2,
+        description    => 'Apache HTTP',
+        check_command  => 'check_http_wikipedia',
+        retries        => 2,
+        retry_interval => 2,
     }
 
     monitoring::service { 'appserver_http_hhvm':
-        description   => 'HHVM rendering',
-        check_command => 'check_http_wikipedia_main',
-        retries       => 2,
+        description    => 'HHVM rendering',
+        check_command  => 'check_http_wikipedia_main',
+        retries        => 2,
+        retry_interval => 2,
     }
 
     nrpe::monitor_service { 'hhvm':

--
To view, visit https://gerrit.wikimedia.org/r/320361

Gerrit-MessageType: merged
Gerrit-Change-Id: I69168e1e9babc3f4805b715195ed065c052ed7de
Gerrit-PatchSet: 4
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Elukey <ltosc...@wikimedia.org>
Gerrit-Reviewer: Alexandros Kosiaris <akosia...@wikimedia.org>
Gerrit-Reviewer: Elukey <ltosc...@wikimedia.org>
Gerrit-Reviewer: Giuseppe Lavagetto <glavage...@wikimedia.org>
Gerrit-Reviewer: jenkins-bot <>
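
Note on the timing argument in the commit message: the rough sketch below estimates how long a service may stay down before a check goes into a hard (alerting) state. It is not part of the change. It assumes that `retries` maps to Nagios `max_check_attempts`, that `retry_interval` is counted in units of the default `interval_length` of 60 seconds, and that the interval before this change was 1; none of this is confirmed by the diff. The function name is hypothetical.

```python
# Rough model of Nagios/Icinga soft-state retries. The mapping of the Puppet
# parameters to Nagios settings is an assumption, not taken from the change.

def seconds_until_hard_state(max_check_attempts: int,
                             retry_interval: float,
                             interval_length: int = 60) -> float:
    """Seconds between the first failed check and the final failed recheck.

    Assumes every recheck fails and ignores check execution time and
    scheduler jitter. After the first failure, the remaining
    (max_check_attempts - 1) rechecks are spaced retry_interval units apart.
    """
    if max_check_attempts < 1:
        raise ValueError("max_check_attempts must be at least 1")
    return (max_check_attempts - 1) * retry_interval * interval_length


if __name__ == "__main__":
    # Assumed previous setting: retries => 2 with a retry_interval of 1.
    print(seconds_until_hard_state(2, 1))
    # Setting from this change: retries => 2, retry_interval => 2.
    print(seconds_until_hard_state(2, 2))
```

Under these assumptions, raising `retry_interval` from 1 to 2 doubles the time an HHVM restart may take before the second failed check raises an alarm.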