Akosiaris has submitted this change and it was merged.

Change subject: Change the way cron run times are calculated
......................................................................

Change the way cron run times are calculated

Make the puppet run interval more configurable by allowing it to be
changed through a single value instead of a value plus a cron entry.
The ERB template now calculates the cron entry from the interval and
crontime values. The interval is assumed to be at most 60 minutes and
at least 1 minute.

Furthermore, enable splay for puppet runs, allowing hosts to delay
their runs by up to 45s. This distributes puppet runs more evenly
within a minute, resulting in a more balanced load on the
puppetmasters.

Finally, make the freshness interval dependent on the run interval
(albeit with a somewhat hardcoded factor) and have icinga report in the
freshness check message how long the service has been in a failed
state, avoiding the need to change the freshness check interval in
multiple places.

Change-Id: Ie44b37f45a2962954927e988947315e8d9459400
---
M modules/base/manifests/init.pp
M modules/base/templates/puppet.cron.erb
M templates/icinga/checkcommands.cfg.erb
3 files changed, 26 insertions(+), 8 deletions(-)

Approvals:
  Akosiaris: Looks good to me, approved
  jenkins-bot: Verified

diff --git a/modules/base/manifests/init.pp b/modules/base/manifests/init.pp
index 45e60f0..d2a11af 100644
--- a/modules/base/manifests/init.pp
+++ b/modules/base/manifests/init.pp
@@ -66,6 +66,14 @@
     include passwords::puppet::database
 
+    ## run puppet by cron and
+    ## rotate puppet logs generated by cron
+    ## This is in mins. Do not set this to 0 or > 60
+    $interval = 30
+    $crontime = fqdn_rand(60)
+    # Calculate freshness interval in seconds (hence *60)
+    $freshnessinterval = $interval * 60 * 6
+
     package { [ 'puppet', 'facter', 'coreutils' ]:
         ensure => latest;
     }
@@ -97,7 +105,13 @@
             require => [ Package['snmp'], File['/etc/snmp'] ];
     }
 
-    monitor_service { 'puppet freshness': description => 'Puppet freshness', check_command => 'puppet-FAIL', passive => 'true', freshness => 10800, retries => 1 ; }
+    monitor_service { 'puppet freshness':
+        description   => 'Puppet freshness',
+        check_command => 'puppet-FAIL',
+        passive       => 'true',
+        freshness     => $freshnessinterval,
+        retries       => 1,
+    }
 
     case $::realm {
         'production': {
@@ -188,10 +202,6 @@
         enable => false,
         ensure => stopped;
     }
-
-    ## run puppet by cron and
-    ## rotate puppet logs generated by cron
-    $crontime = fqdn_rand(30)
 
     file {
         "/etc/cron.d/puppet":
diff --git a/modules/base/templates/puppet.cron.erb b/modules/base/templates/puppet.cron.erb
index 4450ed0..a568f55 100644
--- a/modules/base/templates/puppet.cron.erb
+++ b/modules/base/templates/puppet.cron.erb
@@ -2,6 +2,14 @@
 ##### THIS FILE IS MANAGED BY PUPPET
 ##### as template('base/puppet.cron.erb')
 ######################################################################
-<% $crontime = scope.lookupvar('base::puppet::crontime') -%>
-<%= $crontime %>,<%= $crontime.to_i + 30 %> * * * * root timeout <% if scope.function_versioncmp([lsbdistrelease, "12.04"]) >= 0 %> -k 300<% end %> 1800 puppet agent --onetime --verbose --no-daemonize --no-splay --show_diff >> /var/log/puppet.log 2>&1
+<%-
+interval = scope.lookupvar('base::puppet::interval')
+crontime = scope.lookupvar('base::puppet::crontime')
+numtimes = 60 / interval
+tmp = Array.new(numtimes) { |t| t = t * interval + crontime }
+tmp = tmp.map { |x| if x < 60 then x else x - 60 end }
+tmp = tmp.sort()
+times = tmp.join(',')
+-%>
+<%= times %> * * * * root timeout <% if scope.function_versioncmp([lsbdistrelease, "12.04"]) >= 0 %> -k 300<% end %> 1800 puppet agent --onetime --verbose --no-daemonize --splay --splaylimit 45 --show_diff >> /var/log/puppet.log 2>&1
 @reboot root timeout <% if scope.function_versioncmp([lsbdistrelease, "12.04"]) >= 0 %> -k 300<% end %> 1800 puppet agent --onetime --verbose --no-daemonize --no-splay --show_diff >> /var/log/puppet.log 2>&1
diff --git a/templates/icinga/checkcommands.cfg.erb b/templates/icinga/checkcommands.cfg.erb
index 9a26b32..4778154 100644
--- a/templates/icinga/checkcommands.cfg.erb
+++ b/templates/icinga/checkcommands.cfg.erb
@@ -466,7 +466,7 @@
 define command{
 	command_name	puppet-FAIL
-	command_line	echo "No successful Puppet run in the last 3 hours" && exit 2
+	command_line	echo "No successful Puppet run for $SERVICEDURATION$" && exit 2
 }
 
 define command{

-- 
To view, visit https://gerrit.wikimedia.org/r/96247
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ie44b37f45a2962954927e988947315e8d9459400
Gerrit-PatchSet: 4
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Akosiaris <akosia...@wikimedia.org>
Gerrit-Reviewer: Akosiaris <akosia...@wikimedia.org>
Gerrit-Reviewer: Hashar <has...@free.fr>
Gerrit-Reviewer: jenkins-bot
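
For readers who want to check the minute arithmetic in puppet.cron.erb outside of Puppet, the calculation can be sketched in plain Ruby roughly as follows. The helper names cron_minutes and freshness_interval are hypothetical and do not appear in the change; the sketch assumes both inputs are already integers, which the template itself does not verify.

```ruby
# Hypothetical stand-alone helpers mirroring the arithmetic in the change.
# interval: minutes between puppet runs, assumed to be an Integer in 1..60.
# crontime: per-host minute offset, assumed to be an Integer in 0..59
#           (fqdn_rand(60) in the manifest).
def cron_minutes(interval, crontime)
  raise ArgumentError, 'interval must be between 1 and 60' unless (1..60).cover?(interval)

  numtimes = 60 / interval # integer division: number of runs per hour
  minutes = Array.new(numtimes) { |t| t * interval + crontime }
  minutes = minutes.map { |m| m < 60 ? m : m - 60 } # wrap back into 0..59
  minutes.sort.join(',')
end

# Freshness threshold in seconds: six run intervals, as in init.pp.
def freshness_interval(interval)
  interval * 60 * 6
end

puts cron_minutes(30, 45)   # prints 15,45
puts cron_minutes(20, 50)   # prints 10,30,50
puts freshness_interval(30) # prints 10800
```

With the default interval of 30, the freshness value comes out to 10800 seconds, the same 3 hours that was previously hardcoded. Because numtimes uses integer division, an interval that does not divide 60 evenly (45, for example) yields fewer, unevenly spaced runs per hour.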