Ottomata has submitted this change and it was merged. ( https://gerrit.wikimedia.org/r/350888 )

Change subject: Update cron job copying mediawiki db into hdfs
......................................................................


Update cron job copying mediawiki db into hdfs

The current cron job uses the new month as the snapshot name. This
means the dataset doesn't contain the data for the month it's named
after. By convention, hadoop datasets are named after the data that
is present, and this patch corrects that.

Bug: T163483
Change-Id: I600cc6822a6c784e7b837ae794538c6bcb29aaf1
---
M modules/role/manifests/analytics_cluster/refinery/job/sqoop_mediawiki.pp
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Ottomata: Verified; Looks good to me, approved


diff --git a/modules/role/manifests/analytics_cluster/refinery/job/sqoop_mediawiki.pp b/modules/role/manifests/analytics_cluster/refinery/job/sqoop_mediawiki.pp
index bdacc3d..29ca5c0 100644
--- a/modules/role/manifests/analytics_cluster/refinery/job/sqoop_mediawiki.pp
+++ b/modules/role/manifests/analytics_cluster/refinery/job/sqoop_mediawiki.pp
@@ -25,7 +25,7 @@
     $num_processors = 3
 
     cron { 'refinery-sqoop-mediawiki':
-        command => "${env} && /usr/bin/python3 ${role::analytics_cluster::refinery::path}/bin/sqoop-mediawiki-tables --job-name sqoop-mediawiki-monthly-$(/bin/date '+\\%Y-\\%m') --labs --jdbc-host ${db_host} --output-dir ${$output_directory} --wiki-file ${wiki_file} --user ${db_user} --password-file ${db_password_file} --timestamp \$(/bin/date '+\\%Y\\%m01000000') --snapshot \$(/bin/date '+\\%Y-\\%m') -k ${num_processors} >> ${log_file} 2>&1",
+        command => "${env} && /usr/bin/python3 ${role::analytics_cluster::refinery::path}/bin/sqoop-mediawiki-tables --job-name sqoop-mediawiki-monthly-$(/bin/date --date=\"$(/bin/date +\\%Y-\\%m-15) -1 month\" +'\\%Y-\\%m') --labs --jdbc-host ${db_host} --output-dir ${$output_directory} --wiki-file ${wiki_file} --user ${db_user} --password-file ${db_password_file} --timestamp \$(/bin/date '+\\%Y\\%m01000000') --snapshot \$(/bin/date '+\\%Y-\\%m') -k ${num_processors} >> ${log_file} 2>&1",
         user => 'hdfs',
         minute => '0',
         hour => '0',

-- 
To view, visit https://gerrit.wikimedia.org/r/350888
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I600cc6822a6c784e7b837ae794538c6bcb29aaf1
Gerrit-PatchSet: 2
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Joal <j...@wikimedia.org>
Gerrit-Reviewer: Ottomata <ao...@wikimedia.org>
Gerrit-Reviewer: jenkins-bot <>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
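
Note on the date arithmetic in the patched command: the new --job-name expression first pins the current month to its 15th day and only then subtracts one month. A plausible reason is that a bare "-1 month" evaluated late in a long month can land on a day that does not exist in the previous month and be normalized forward again. The following is a minimal sketch of that idea, not code from the patch: it assumes GNU date, the helper name previous_month is hypothetical, and the crontab-specific \% escaping is left out because this is plain shell.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): print the YYYY-MM label of the
# month preceding the given date. Relies on GNU date's --date arithmetic.
previous_month() {
    # Pin the reference day to the 15th, as the patched cron command does,
    # so that "-1 month" always lands on a day that exists.
    anchor="$(date --date="$1" +%Y-%m)-15"
    date --date="${anchor} -1 month" +%Y-%m
}

previous_month 2017-05-01   # a run on 1 May is labelled with April
previous_month 2017-03-31   # a month-end date still resolves to February
previous_month 2017-01-10   # January rolls back into the previous year
```

With this approach the label depends only on the month in which the job runs, not on the day of the month, which matches the commit message's goal of naming the dataset after the month whose data it actually contains.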