Mohamed Magdy wrote:
> I don't remember if I already mentioned this: you can split the xml
> file* into smaller pieces then import it using importDump.php.
>
> Use a loop to make a file like this and then run it:
> #!/bin/bash
> php maintenance/importDump.php < /path/pagexml.1
> wait
> php maintenance/importDump.php < /path/pagexml.2
> ...
>
> I haven't tried to start many php importDump.php processes working on
> different xml files simultaneously, will it work?
>
> * = http://blog.prashanthellina.com/2007/10/17/ways-to-process-and-use-wikipedia-dumps/
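[For reference, the splitting step could look something like the sketch below. This is my own illustration, not the script from the linked blog post: an awk one-liner that cuts the dump only at </page> boundaries, copies the <mediawiki>/<siteinfo> header into every chunk, and closes each chunk with its own </mediawiki>, so that each piece is a well-formed file for importDump.php. The file names (pages.xml, pagexml.N) and the pages-per-chunk count are assumptions for the demo.]

```shell
#!/bin/bash
# Hypothetical sketch: split a MediaWiki XML dump at </page>
# boundaries into chunks of (here) 2 pages each, so every chunk
# is itself a well-formed <mediawiki> document.

# Tiny sample dump, just so the demo is self-contained; a real
# dump would be the downloaded pages.xml.
cat > pages.xml <<'EOF'
<mediawiki>
  <siteinfo><sitename>Demo</sitename></siteinfo>
  <page><title>A</title></page>
  <page><title>B</title></page>
  <page><title>C</title></page>
</mediawiki>
EOF

awk -v per=2 '
    !body && /<page>/  { body = 1 }            # header ends at first <page>
    !body              { hdr = hdr $0 "\n"; next }  # collect header lines
    /<\/mediawiki>/    { next }                # drop the dump closer; we add our own
    {
        if (out == "") {                       # start a new chunk file
            out = "pagexml." ++n
            printf "%s", hdr > out             # replay the header into it
        }
        print > out
    }
    /<\/page>/ && (++c % per == 0) {           # chunk is full: close it
        print "</mediawiki>" > out
        close(out)
        out = ""
    }
    END { if (out != "") print "</mediawiki>" > out }  # close a partial last chunk
' pages.xml
```

With the 3-page sample above this writes pagexml.1 (pages A and B) and pagexml.2 (page C), each carrying the header and a closing </mediawiki>.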
Thanks Mohamed – this is a good suggestion, but I am a bit wary of trying
it, because if I run into problems later, I would not be sure whether they
were caused by using this script to split the XML files. I understand that
the script looks OK, in that it simply splits the XML files at the
“</page>” boundaries – but I don’t know much about how this would affect
the final result.

Thanks again,
O. O.

_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l