Max Semenik wrote:
>A month ago, PageImages extension was black-deployed, intended to
>automatically associate images with articles.

I looked at <https://www.mediawiki.org/wiki/Extension:PageImages> and I'm
still having difficulty understanding this extension's purpose. Is there a
related bug or request for comment (RFC) for this?

>select count(*), avg(page_len) from page where page_namespace=0 and
>page_is_redirect=0 and page_touched < '20121229000000';
>+----------+---------------+
>| count(*) | avg(page_len) |
>+----------+---------------+
>|   977568 |     3172.0948 |
>+----------+---------------+
>1 row in set (5 min 59.55 sec)

select count(*) from page where page_namespace=0 and page_is_redirect=1
and page_touched < '20120101000000';
+----------+
| count(*) |
+----------+
|       16 |
+----------+
1 row in set (26.61 sec)

I ran a script in December 2012 on the English Wikipedia that updated the
page_touched date of every redirect in NS:0 (and a few other namespaces, I
believe) where the page_touched date was not like '2012%'. I'd considered
running the same script on non-redirects. It turns out that if you take
the stored wikitext of pages and echo (post) it back at the wiki via the
edit action a few million times, you can discover some interesting bugs.
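
For illustration only, here is a minimal Python sketch of the approach described above. It is not the script that was actually run. The function names (needs_touch, build_edit_params, plan_touches) and the (title, page_touched, wikitext) row layout are hypothetical. The parameters follow the MediaWiki API's action=edit module, and the sketch only builds the request bodies; it sends nothing over the network.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical row layout: (title, page_touched, stored wikitext).
Row = Tuple[str, str, str]


def needs_touch(page_touched: str, prefix: str = "2012") -> bool:
    """Mirror the "page_touched not like '2012%'" filter.

    page_touched is a MediaWiki timestamp string (YYYYMMDDHHMMSS).
    """
    return not page_touched.startswith(prefix)


def build_edit_params(title: str, wikitext: str, token: str) -> Dict[str, str]:
    """Build POST parameters that send a page's stored wikitext back unchanged."""
    return {
        "action": "edit",
        "format": "json",
        "title": title,
        "text": wikitext,  # identical to what is already stored
        "token": token,    # edit token obtained separately (assumed)
    }


def plan_touches(rows: Iterable[Row], token: str) -> List[Dict[str, str]]:
    """Return one edit request body per page whose page_touched is stale."""
    return [
        build_edit_params(title, text, token)
        for title, touched, text in rows
        if needs_touch(touched)
    ]
```

Each resulting dictionary would be POSTed to the wiki's api.php by whatever HTTP client the script uses; throttling and error handling are left out here.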

>Thus, I would like to populate this data with a script[3]. To reduce
>the scare, let me remark that these pages have almost no templates and
>are significantly smaller than average: 3172 bytes vs. 5673 so they
>should be mostly fast to parse.

I don't think there's any reason to be scared here.

MZMcBride



_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
