Hi,

I am interested in performing analysis on recently created pages on English
Wikipedia.

One way to find recently created pages is to download a meta-history file
for English Wikipedia and filter the XML for pages whose oldest revision
falls within the desired timespan.
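
For reference, here is a minimal sketch of that XML approach. It assumes the
usual MediaWiki export layout (page elements containing a title and one or
more revision elements, each with a UTC timestamp such as
2023-01-05T08:30:00Z). The function name recently_created_pages and the
half-open [start, end) window are my own choices for illustration, not
anything defined by the dumps.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Timestamp layout assumed for <timestamp> elements in the export XML.
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def recently_created_pages(stream, start, end):
    """Yield (title, created) for pages whose oldest revision is in [start, end).

    `stream` is a binary file-like object containing export XML.
    `start` and `end` are timezone-aware datetimes.
    """
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "text":
            # Drop revision wikitext as soon as it is parsed to save memory.
            elem.clear()
            continue
        if name != "page":
            continue

        title = None
        oldest = None
        for child in elem:
            child_name = local_name(child.tag)
            if child_name == "title":
                title = child.text
            elif child_name == "revision":
                for field in child:
                    if local_name(field.tag) != "timestamp":
                        continue
                    ts = datetime.strptime(field.text, TS_FORMAT)
                    ts = ts.replace(tzinfo=timezone.utc)
                    # Take the minimum rather than trusting revision order.
                    if oldest is None or ts < oldest:
                        oldest = ts

        if oldest is not None and start <= oldest < end:
            yield title, oldest
        # Release the finished page's children before moving on.
        elem.clear()
```

For a real dump, the stream could come from bz2.open(path, "rb"), where path
is whatever compressed file was downloaded. Each page is still held in
memory until its closing tag, so this is a sketch of the filtering logic
rather than a tuned implementation.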

Since this means parsing the XML with a library, I would imagine it is much
slower than a database query. Is page revision data available in one of the
SQL dumps, so that I could query it for this use case?
Judging by the exported tables list
<https://meta.wikimedia.org/wiki/Data_dumps/What%27s_available_for_download#Database_tables>,
it does not appear to be. Maybe this is intentional?

Thanks,
Eric Andrew Lewis
ericandrewlewis.com
+1 610 715 8560
_______________________________________________
Xmldatadumps-l mailing list -- xmldatadumps-l@lists.wikimedia.org
To unsubscribe send an email to xmldatadumps-l-le...@lists.wikimedia.org
