[MediaWiki-commits] [Gerrit] dumps: clean up construction of list of possible dump jobs f... - change (operations/dumps)

2015-11-23 Thread ArielGlenn (Code Review)
ArielGlenn has submitted this change and it was merged.

Change subject: dumps: clean up construction of list of possible dump jobs for wiki
..


dumps: clean up construction of list of possible dump jobs for wiki

Move the checks for which jobs to add out into a single method,
instead of having all those checks inline in the code. Note that
this is not the list of jobs that will necessarily be run, but the
list of all jobs that would be run for a full dump of the wiki.

Change-Id: I1a9a61b3ea654b0e4ff80cd00bd843f9e4b554cf
---
M xmldumps-backup/dumps/runner.py
1 file changed, 99 insertions(+), 102 deletions(-)

Approvals:
  ArielGlenn: Verified; Looks good to me, approved
  jenkins-bot: Verified



diff --git a/xmldumps-backup/dumps/runner.py b/xmldumps-backup/dumps/runner.py
index 7e52e0b..c4ecdee 100644
--- a/xmldumps-backup/dumps/runner.py
+++ b/xmldumps-backup/dumps/runner.py
@@ -111,21 +111,15 @@
                 "Data for blocks of IP addresses, ranges, and users."),
    PrivateTable("archive", "archivetable",
                 "Deleted page and revision data."),
-   # PrivateTable("updates", "updatestable",
-   #  "Update dataset for OAI updater system."),
    PrivateTable("logging", "loggingtable",
                 "Data for various events (deletions, uploads, etc)."),
    PrivateTable("oldimage", "oldimagetable",
                 "Metadata on prior versions of uploaded images."),
-   # PrivateTable("filearchive", "filearchivetable",
-   # "Deleted image data"),

    PublicTable("site_stats", "sitestatstable",
                "A few statistics such as the page count."),
    PublicTable("image", "imagetable",
                "Metadata on current versions of uploaded media/files."),
-   # PublicTable("oldimage", "oldimagetable",
-   #"Metadata on prior versions of uploaded media/files."),
    PublicTable("pagelinks", "pagelinkstable",
                "Wiki page-to-page link records."),
    PublicTable("categorylinks", "categorylinkstable",
@@ -138,9 +132,6 @@
                "Wiki external URL link records."),
    PublicTable("langlinks", "langlinkstable",
                "Wiki interlanguage link records."),
-   # PublicTable("interwiki", "interwikitable",
-   #"Set of defined interwiki prefixes " +
-   #"and links for this wiki."),
    PublicTable("user_groups", "usergroupstable", "User group assignments."),
    PublicTable("category", "categorytable", "Category information."),

@@ -152,10 +143,6 @@
                "Name/value pairs for pages."),
    PublicTable("protected_titles", "protectedtitlestable",
                "Nonexistent pages that have been protected."),
-   # PublicTable("revision", revisiontable",
-   #"Base per-revision data (does not include text)."), // safe?
-   # PrivateTable("text", "texttable",
-   #"Text blob storage. May be compressed, etc."), // ?
    PublicTable("redirect", "redirecttable", "Redirect list"),
    PublicTable("iwlinks", "iwlinkstable",
                "Interwiki link tracking records"),
@@ -171,18 +158,17 @@
            self._get_partnum_todo("abstractsdump"), self.wiki.db_name,
            self.filepart.get_pages_per_filepart_abstract())]

-        if self.filepart.parts_enabled():
-            self.dump_items.append(RecombineAbstractDump(
-                "abstractsdumprecombine", "Recombine extracted page abstracts for Yahoo",
-                self.find_item_by_name('abstractsdump')))
+        self.append_job_if_needed(RecombineAbstractDump(
+            "abstractsdumprecombine", "Recombine extracted page abstracts for Yahoo",
+            self.find_item_by_name('abstractsdump')))

        self.dump_items.append(XmlStub("xmlstubsdump", "First-pass for page XML data dumps",
                                       self._get_partnum_todo("xmlstubsdump"),
                                       self.filepart.get_pages_per_filepart_history()))
-
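
The archive cuts the diff off here, so the body of the new
append_job_if_needed() method does not appear. A minimal sketch of
what such a consolidated check could look like, inferred only from
the inline parts_enabled() test it replaces in the hunk above (the
name-based dispatch and the name() accessor are assumptions, not the
merged implementation):

    def append_job_if_needed(self, job):
        # Sketch only: centralizes the per-job exception checks that
        # used to sit inline at every call site. The sole condition
        # visible in the truncated diff is the file-parts check that
        # guarded the recombine jobs.
        if job.name().endswith("recombine"):
            # recombine jobs only apply when output is split into
            # file parts that must be reassembled afterwards
            if self.filepart.parts_enabled():
                self.dump_items.append(job)
        else:
            self.dump_items.append(job)

Call sites then shrink to a single unconditional
self.append_job_if_needed(...) line, as in the abstractsdumprecombine
hunk above.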

[MediaWiki-commits] [Gerrit] dumps: clean up construction of list of possible dump jobs f... - change (operations/dumps)

2015-11-09 Thread ArielGlenn (Code Review)
ArielGlenn has uploaded a new change for review.

  https://gerrit.wikimedia.org/r/252132

Change subject: dumps: clean up construction of list of possible dump jobs for wiki
..

dumps: clean up construction of list of possible dump jobs for wiki

Move the checks for which jobs to add out into a single method,
instead of having all those checks inline in the code. Note that
this is not the list of jobs that will necessarily be run, but the
list of all jobs that would be run for a full dump of the wiki.

Change-Id: I1a9a61b3ea654b0e4ff80cd00bd843f9e4b554cf
---
M xmldumps-backup/dumps/runner.py
1 file changed, 99 insertions(+), 102 deletions(-)


  git pull ssh://gerrit.wikimedia.org:29418/operations/dumps refs/changes/32/252132/1
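
To review the patchset locally without merging it into the current
branch, the same change ref can instead be fetched onto a detached
head (standard Gerrit workflow, using the ref shown above):

  git fetch ssh://gerrit.wikimedia.org:29418/operations/dumps refs/changes/32/252132/1 && git checkout FETCH_HEAD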
