On Tue, Jul 21, 2009 at 11:22 AM, Chengbin Zheng <chengbinzh...@gmail.com> wrote:
> On a side note, if parsing the XML gets you the static HTML version of
> Wikipedia, why can't Wikimedia just parse it for us and save a lot of our
> time (parsing and learning), and use that as the static HTML dump version?
I'd assume it was a performance issue: parsing every page for every dump, as often as dumps are produced, might simply have used too much CPU to be worth it at the time. Parsing some individual pages can take 20 seconds or more, and there are millions of them (although most are much faster to parse than that). I'm sure it could be reinstituted with some effort, though.
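
For what it's worth, the "parse it yourself" route the original question alludes to looks roughly like the sketch below: stream pages out of a pages-articles dump and hand the wikitext to a MediaWiki install's api.php (action=parse) to get HTML back. The endpoint URL, dump filename, and output directory are placeholders I've made up, and it skips redirects, non-article namespaces, and proper filename sanitization — a sketch of the approach, not how the old static HTML dumps were actually generated.

```python
# Hypothetical sketch: stream pages from a pages-articles XML dump and have a
# MediaWiki install render each one to HTML via api.php (action=parse).
# API_URL, DUMP, and OUT_DIR are placeholder assumptions.
import os
import xml.etree.ElementTree as ET

import requests

API_URL = "http://localhost/w/api.php"   # assumed local MediaWiki install
DUMP = "enwiki-pages-articles.xml"       # assumed dump filename
OUT_DIR = "static_html"


def local_name(tag):
    """Strip the export-format XML namespace, which varies between dump versions."""
    return tag.rsplit("}", 1)[-1]


def render(title, wikitext):
    """Ask MediaWiki to render one page of wikitext to HTML."""
    resp = requests.post(API_URL, data={
        "action": "parse",
        "format": "json",
        "title": title,
        "text": wikitext,
        "prop": "text",
    })
    resp.raise_for_status()
    return resp.json()["parse"]["text"]["*"]


os.makedirs(OUT_DIR, exist_ok=True)
title, wikitext = None, None
for event, elem in ET.iterparse(DUMP, events=("end",)):
    name = local_name(elem.tag)
    if name == "title":
        title = elem.text
    elif name == "text":
        wikitext = elem.text or ""
    elif name == "page":
        html = render(title, wikitext)
        fname = title.replace("/", "_") + ".html"
        with open(os.path.join(OUT_DIR, fname), "w", encoding="utf-8") as f:
            f.write(html)
        elem.clear()  # keep memory bounded while streaming a multi-GB dump
```

Run against a full dump, a naive loop like this makes the cost above concrete: one render per page, times millions of pages, with the slowest pages taking tens of seconds each.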