Ottomata has submitted this change and it was merged. ( 
https://gerrit.wikimedia.org/r/350888 )

Change subject: Update cron job copying mediawiki db into hdfs
......................................................................


Update cron job copying mediawiki db into hdfs

The current cron job uses the new month as the snapshot name.
This means the dataset doesn't contain the data for the month
it's named after. By convention, Hadoop datasets are named after
the data that is present, and this patch corrects that.

Bug: T163483
Change-Id: I600cc6822a6c784e7b837ae794538c6bcb29aaf1
---
M modules/role/manifests/analytics_cluster/refinery/job/sqoop_mediawiki.pp
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Ottomata: Verified; Looks good to me, approved



diff --git a/modules/role/manifests/analytics_cluster/refinery/job/sqoop_mediawiki.pp b/modules/role/manifests/analytics_cluster/refinery/job/sqoop_mediawiki.pp
index bdacc3d..29ca5c0 100644
--- a/modules/role/manifests/analytics_cluster/refinery/job/sqoop_mediawiki.pp
+++ b/modules/role/manifests/analytics_cluster/refinery/job/sqoop_mediawiki.pp
@@ -25,7 +25,7 @@
     $num_processors   = 3
 
     cron { 'refinery-sqoop-mediawiki':
-        command  => "${env} && /usr/bin/python3 ${role::analytics_cluster::refinery::path}/bin/sqoop-mediawiki-tables --job-name sqoop-mediawiki-monthly-$(/bin/date '+\\%Y-\\%m') --labs --jdbc-host ${db_host} --output-dir ${$output_directory} --wiki-file  ${wiki_file} --user ${db_user} --password-file ${db_password_file} --timestamp \$(/bin/date '+\\%Y\\%m01000000') --snapshot \$(/bin/date '+\\%Y-\\%m') -k ${num_processors} >> ${log_file} 2>&1",
+        command  => "${env} && /usr/bin/python3 ${role::analytics_cluster::refinery::path}/bin/sqoop-mediawiki-tables --job-name sqoop-mediawiki-monthly-$(/bin/date --date=\"$(/bin/date +\\%Y-\\%m-15) -1 month\" +'\\%Y-\\%m') --labs --jdbc-host ${db_host} --output-dir ${$output_directory} --wiki-file  ${wiki_file} --user ${db_user} --password-file ${db_password_file} --timestamp \$(/bin/date '+\\%Y\\%m01000000') --snapshot \$(/bin/date '+\\%Y-\\%m') -k ${num_processors} >> ${log_file} 2>&1",
         user     => 'hdfs',
         minute   => '0',
         hour     => '0',
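
The date arithmetic in the new --job-name expression can be tried
outside cron. A minimal sketch, assuming GNU date (the --date option
is not POSIX); the function name previous_month and its optional
reference-date argument are hypothetical and not part of the patch.
In the crontab entry the % signs are backslash-escaped because cron
treats a bare % as a newline; in a plain shell they are written as-is.

```shell
# Print the month preceding the given date (default: today) as YYYY-MM.
# Hypothetical helper for illustration; requires GNU date.
previous_month() {
    ref="${1:-$(date +%Y-%m-%d)}"
    # Pin the date to the 15th first, so that "-1 month" cannot
    # overflow into the wrong month near a month end.
    anchor="$(date --date="$ref" +%Y-%m-15)"
    date --date="$anchor -1 month" +%Y-%m
}
```

Pinning to the 15th matters at month ends: with GNU date,
"2017-03-31 -1 month" is normalized from February 31 to March 3 and
still formats as 2017-03, whereas the pinned form yields 2017-02.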

-- 
To view, visit https://gerrit.wikimedia.org/r/350888
To unsubscribe, visit https://gerrit.wikimedia.org/r/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I600cc6822a6c784e7b837ae794538c6bcb29aaf1
Gerrit-PatchSet: 2
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Joal <j...@wikimedia.org>
Gerrit-Reviewer: Ottomata <ao...@wikimedia.org>
Gerrit-Reviewer: jenkins-bot <>

_______________________________________________
MediaWiki-commits mailing list
MediaWiki-commits@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-commits
